This page describes DataPM, how it works, why you need it, and the license granted to you.
What is DataPM?
DataPM stands for Data Package Manager, and it helps data publishers and consumers exchange data more easily. Think of DataPM like Homebrew, pip, Maven, apt, or yum, but for data.
DataPM helps you quickly search, review, and obtain data packages. DataPM also delivers that data to your target repository for you, with no ETL scripting!
DataPM works with batch data and append log data, and in the near future will work with streaming data. DataPM also supports public and private data exchange.
Who uses DataPM?
Data Scientists, Data Engineers, Researchers, Analysts, and any other data-oriented knowledge workers use DataPM to discover new data, exchange data, collaborate, and build working catalogs of data.
How do people interact with DataPM?
Data publishers, such as researchers or analysts, use simple DataPM tools that automatically read existing data sets and produce a Data Package File. The publisher then enhances this package file with documentation and other supporting materials. Finally, with a single command, the publisher publishes the package file to a public or private DataPM registry.
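To make the "read an existing data set and produce a package file" step more concrete, here is a minimal Python sketch of the general idea. It is not DataPM's code or its actual package file format: the function names, the type-guessing rules, and the descriptor fields (name, recordCount, schema, description) are all assumptions made up for illustration.

```python
import csv
import io
import json


def infer_type(values):
    """Guess a column type from its string values (hypothetical rule set)."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "string"
    # Try the narrowest type first, then fall back to wider ones.
    for type_name, caster in (("integer", int), ("number", float)):
        try:
            for v in non_empty:
                caster(v)
            return type_name
        except ValueError:
            continue
    return "string"


def build_package_file(name, csv_text):
    """Read CSV text and return a descriptor dict (illustrative layout only)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    columns = reader.fieldnames or []
    return {
        "name": name,
        "recordCount": len(rows),
        # Missing cells come back as None, so treat them as empty strings.
        "schema": {
            col: infer_type([(row[col] or "") for row in rows])
            for col in columns
        },
        "description": "",  # left blank for the publisher to fill in later
    }


# Hypothetical sample data standing in for an existing data set.
sample = "city,population,area_km2\nOslo,709000,454.0\nBergen,289000,465.3\n"
package_file = build_package_file("norway-cities", sample)
print(json.dumps(package_file, indent=2))
```

In this sketch the generated descriptor is only a starting point. The empty description field stands in for the documentation and supporting materials that the publisher adds before publishing.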
Data consumers, such as Data Scientists or Financial Managers, use the public or private DataPM registries to obtain data packages. These consumers can search for topics of interest, subscribe to publishers or package files, and of course fetch or stream those data packages as needed.
Data Engineers can use DataPM tools to dramatically simplify and automate the publishing and consuming of data packages. DataOps (like DevOps) can greatly improve the quality of a publishing and consuming process!
What does DataPM cost?
DataPM is an ecosystem of free and (soon to be) open source tools that are used to publish and consume free and non-free data packages. Data publishers define the pricing for their own data packages.
DataPM's data publishing, consuming, and registry tools are free.
In the future we may offer paid tools that help you better understand the contents of data.
Is DataPM secure?
DataPM is based on open source tools, and itself will be open source in the future. DataPM is specifically architected to be simple, and therefore easy to secure and maintain. DataPM is also flexible, allowing for a wide variety of use cases.
While DataPM's core focus is the proliferation of high-quality data packages, it can also be used to track and inventory your own data sets. Implementing DataPM in your organization will give you visibility into who has access to each of your data sets, without making copies of your data and without changing your existing architecture. Think more visibility, more control.
What is the DataPM software license?
DataPM is offered under the DataPM License V1.