Platform1: a custom data sharing environment
Data teams face unique challenges when it comes to managing and sharing data effectively. You may have scarce resources, you may have only one high-CPU server, or you may not have your own machine at all. Cloud-based data products don’t give you enough control over the environment, and they may tie you to their proprietary software or complicate the process by spreading your data across multiple servers and locations.
Platform1 is a shared environment designed for collaborative data processing and analysis. We aim to empower your organization with seamless collaboration, heavy processing capabilities, and uncompromising data security. The flexibility and scalability of cloud-based products meets the control and security of local solutions.
Share datasets, collaborate with team members, and maintain version control, all within a secure and controlled environment. Platform1 is built upon a foundation of open-source and long-term supported software such as Linux, JupyterHub, RStudio Server, Git, VirtualBox, and Apache. The environment is a versatile and comprehensive solution for data collaboration. Platform1 is designed to be installed on just about any Debian-based Linux machine, and our team will configure it to ensure seamless integration with your existing infrastructure. This gives you full control over your data environment, from installation to operation, so you can manage your data on your terms.
How it works
We tap into the power of JupyterHub and RStudio Server, which are made to integrate and coordinate multiple concurrent user sessions in Jupyter and RStudio.
With individual logins for each application, your data scientists can work in their preferred environment for data analysis. The versatility of JupyterLab allows them to open their own Linux terminal sessions for running other software. Because JupyterHub and RStudio Server run together on the same machine, the JupyterHub user admin can interface directly with RStudio Server.
No more emailing your data back and forth as .csv and .xlsx files! Because we host all users on the same Linux server as full non-administrative users, the entire team can load and read data for analysis from one shared directory. The data stays on the same server throughout. We can also use Git and GitHub for version control, for reviewing code with your collaborating scientists, and for code backups.
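As a rough illustration of the shared-directory workflow, the sketch below shows how any team member might list and load shared CSV files from a notebook or terminal session. The path `/srv/platform1/shared` and the function names `list_datasets` and `load_dataset` are assumptions made for this example, not part of Platform1 itself; the actual directory depends on how your installation is configured.

```python
import csv
from pathlib import Path

# Hypothetical location of the team's shared directory (an assumption for
# this sketch); replace it with the directory configured on your server.
SHARED_DIR = Path("/srv/platform1/shared")


def list_datasets(shared_dir: Path = SHARED_DIR) -> list[str]:
    """Return the names of the CSV files in the shared directory, sorted."""
    return sorted(path.name for path in shared_dir.glob("*.csv"))


def load_dataset(name: str, shared_dir: Path = SHARED_DIR) -> list[dict[str, str]]:
    """Read one shared CSV file into a list of row dictionaries.

    Every value is returned as a string, exactly as stored in the file.
    """
    with open(shared_dir / name, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

Every user reads the same file in place, so a call such as `load_dataset("survey.csv")` (a hypothetical file name) returns the same rows for everyone, and no copies need to leave the server.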
Adding more security
With Platform1, you can have peace of mind knowing that your data remains safe and secure, even in a shared collaborative environment.
Your private data remains on the same server throughout processing and collaboration. Because packets containing private data never leave the system, an attacker would have to compromise the server itself to reach that data.
Platform1 operates on the principle of a single point of failure, bolstered by robust backups. However, we recognize that this alone may not be sufficient against malicious software. That’s why we run Platform1 on a virtual server hosted on your main server. The virtual server sits on a private network and is accessed from thin clients on that network via remote desktop. In other words, the only data entering your private network is your users’ keystrokes and clicks, while the only data leaving it is pixel data. Backups of the virtual server can be maintained and periodically exported to yet another backup location, so if the system goes down from a bad update or a malicious attack, the whole system can be halted and deleted, and a backup reloaded in minutes.
What about the cloud?
We can make accommodations for teams that want to run Platform1 in a cloud environment. It can run on any cloud-hosted server that can run a Debian-based Linux operating system. In the future, we may be able to support running Platform1 in a Docker container, which could improve resource efficiency and scalability.