what is datageneous?
datageneous is a managed data DevOps service that can be delivered:
- on premises (with SSH access)
- via private or hybrid cloud
what do you mean by fully managed service?
We handle everything and let you focus on your data. No more stringing together and maintaining services, running patches, managing containers, fixing boxes, making DNS changes, upgrading, release management, issue management, etc. We will work with you to define and deliver SLAs.
what is the technology behind datageneous?
At its core, datageneous is a federatable, distributable, versioned, streaming RDF graph store and linked data platform. It combines aspects of semantic web technologies, robust data versioning, stream processing, and distributed database technology similar to distributed ledger technology (DLT). You can read about our service here. We are extending this core technology into a notebook programming and direct manipulation environment (JupyterLab) with native interfaces to ML capabilities.
what is rdf?
The Resource Description Framework (RDF) is a framework for representing information in the Web. RDF has features that facilitate data merging even if the underlying schemas differ, and it specifically supports the evolution of schemas over time without requiring all the data consumers to be changed. You can read more about RDF here.
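To illustrate the merging property, here is a minimal sketch in plain Python that models an RDF graph as a set of (subject, predicate, object) triples. The example.org identifiers and both sources are invented for illustration. FOAF and schema.org are real vocabularies, but nothing here reflects how datageneous stores data, and blank nodes and literal datatypes are ignored.

```python
# A toy model of RDF: a graph is a set of (subject, predicate, object) triples.
# All example.org identifiers are made up for this sketch.

ALICE = "http://example.org/people/alice"

# Source A describes people with the FOAF vocabulary.
graph_a = {
    (ALICE, "http://xmlns.com/foaf/0.1/name", "Alice"),
    (ALICE, "http://xmlns.com/foaf/0.1/mbox", "mailto:alice@example.org"),
}

# Source B uses a schema.org term and knows nothing about source A's schema.
graph_b = {
    (ALICE, "http://schema.org/jobTitle", "Data Engineer"),
    (ALICE, "http://xmlns.com/foaf/0.1/name", "Alice"),  # duplicate statement
}


def merge(*graphs):
    """Merge graphs by set union; identical triples collapse into one."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph, subject):
    """Return a {predicate: sorted objects} view of one subject."""
    view = {}
    for s, p, o in graph:
        if s == subject:
            view.setdefault(p, []).append(o)
    return {p: sorted(objs) for p, objs in view.items()}


merged = merge(graph_a, graph_b)
for predicate, objects in sorted(describe(merged, ALICE).items()):
    print(predicate, objects)
```

Because both sources identify the same resource with the same IRI, their statements combine without any schema migration, and a consumer that only reads the FOAF properties is unaffected by the new job title property.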
how can I use it?
datageneous can be used in several ways:
- through W3C standards-compliant SPARQL and Graph Store Protocol (GSP) interfaces
- through our evolving UX
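As a rough idea of what the Graph Store Protocol looks like from a client, the sketch below builds (but does not send) two requests following the W3C SPARQL 1.1 Graph Store HTTP Protocol, using only the Python standard library. The `GRAPH_STORE` URL, the graph IRI, and the helper name `gsp_request` are placeholders invented for this example, and authentication is left out.

```python
import urllib.parse
import urllib.request

# Placeholder only: substitute the graph store URL of your own repository.
GRAPH_STORE = "https://example.org/graph-store"


def gsp_request(method, graph_iri=None, turtle=None):
    """Build (but do not send) a Graph Store HTTP Protocol request."""
    # The protocol addresses a named graph with ?graph=<IRI> and the
    # default graph with a bare ?default parameter.
    if graph_iri is None:
        query = "default"
    else:
        query = urllib.parse.urlencode({"graph": graph_iri})
    headers = {}
    data = None
    if turtle is not None:
        # Uploading: the body carries RDF serialized as Turtle.
        data = turtle.encode("utf-8")
        headers["Content-Type"] = "text/turtle"
    else:
        # Downloading or deleting: ask for Turtle in any response body.
        headers["Accept"] = "text/turtle"
    return urllib.request.Request(
        f"{GRAPH_STORE}?{query}", data=data, headers=headers, method=method
    )


# PUT replaces the content of a named graph; GET fetches the default graph.
put = gsp_request(
    "PUT",
    graph_iri="http://example.org/graphs/people",
    turtle='<http://example.org/people/alice> '
           '<http://xmlns.com/foaf/0.1/name> "Alice" .',
)
get = gsp_request("GET")
print(put.get_method(), put.full_url)
print(get.get_method(), get.full_url)
```

In the same protocol, POST merges data into a graph and DELETE removes it, so the same helper covers those cases by changing the `method` argument.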
what is sparql?
SPARQL is the standard query language for RDF. datageneous supports SPARQL 1.1 query and update. As with SQL for relational databases, implementations differ slightly or add a few extensions, but the basics are the same.
You can read about SPARQL here.
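For a feel of the language, the sketch below pairs a small SPARQL 1.1 SELECT query with a hand-written response in the W3C SPARQL 1.1 Query Results JSON Format, and flattens that response into plain Python dictionaries. The data, the example.org IRIs, and the helper name `rows` are assumptions made for illustration; no endpoint is contacted.

```python
import json

# A SPARQL 1.1 SELECT query; the FOAF vocabulary is real, the data is imagined.
QUERY = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person ?name
WHERE { ?person foaf:name ?name }
ORDER BY ?name
LIMIT 10
"""

# A hand-written response in the SPARQL 1.1 Query Results JSON Format,
# standing in for what an endpoint might return for QUERY.
SAMPLE_RESPONSE = """
{
  "head": {"vars": ["person", "name"]},
  "results": {"bindings": [
    {"person": {"type": "uri", "value": "http://example.org/people/alice"},
     "name": {"type": "literal", "value": "Alice"}},
    {"person": {"type": "uri", "value": "http://example.org/people/bob"},
     "name": {"type": "literal", "value": "Bob", "xml:lang": "en"}}
  ]}
}
"""


def rows(response_text):
    """Flatten SPARQL JSON results into a list of {variable: value} dicts."""
    doc = json.loads(response_text)
    variables = doc["head"]["vars"]
    return [
        # A variable may be unbound in a given solution, so check membership.
        {var: binding[var]["value"] for var in variables if var in binding}
        for binding in doc["results"]["bindings"]
    ]


for row in rows(SAMPLE_RESPONSE):
    print(row["name"], row["person"])
```

The flattening discards the term type and language tag of each value, which is convenient for display but loses information a stricter client would keep.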
SPARQL and datageneous
Every datageneous repository has a standard SPARQL endpoint at
You can read our detailed documentation here.
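Assuming the endpoint follows the W3C SPARQL 1.1 Protocol, any HTTP client should be able to talk to it. The sketch below builds a query request and an update request with the Python standard library without sending them. `ENDPOINT` is a placeholder rather than a real datageneous address, the helper names are invented, and authentication is not covered; consult the detailed documentation for the actual values.

```python
import urllib.request

# Placeholder only: use the endpoint URL given for your own repository.
ENDPOINT = "https://example.org/sparql"


def select_request(query):
    """Build a SPARQL 1.1 Protocol query request (query via POST directly)."""
    return urllib.request.Request(
        ENDPOINT,
        data=query.encode("utf-8"),
        headers={
            "Content-Type": "application/sparql-query",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


def update_request(update):
    """Build a SPARQL 1.1 Protocol update request (update via POST directly)."""
    return urllib.request.Request(
        ENDPOINT,
        data=update.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update"},
        method="POST",
    )


query = select_request("SELECT * WHERE { ?s ?p ?o } LIMIT 5")
update = update_request(
    'INSERT DATA { <http://example.org/a> <http://example.org/b> "c" }'
)
print(query.get_method(), query.get_header("Content-type"))
print(update.get_method(), update.get_header("Content-type"))

# Sending is left out on purpose, since ENDPOINT is not a real service:
# with urllib.request.urlopen(query) as response:
#     body = response.read().decode("utf-8")
```

Some services expose query and update at different URLs, so a single `ENDPOINT` constant is a simplification.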
what is GraphQL?
GraphQL is a query language for APIs and a runtime for fulfilling those queries with your existing data. You can read more about GraphQL here.
why did you build it?
We think there are all sorts of problems in which data is a first-class element in development. Managing data for these problems is too expensive, complex, and difficult. In our own experience in organizations, we’ve seen teams struggle with technical debt, complex and opaque data pipelines, missing data provenance information, sprawling data services, and lack of agility. Often frontline teams are held hostage by rapidly expanding centralized systems unable to service increasingly dynamic business requirements. In a search for answers to these concerns, we’ve seen too many organizations turn to IT vendors, only to be sold solutions that often compound problems, require substantial risky investments, lead to lock-in, and offer timelines years into a promised future that never seems to arrive.
We think the best time to start doing data-centric development is now, without trusting magic black boxes, getting locked into vendor features, or sinking lots of money into risky, massive infrastructural systems that promise to eventually deliver the capabilities.
ok but why did you build it?
datageneous has its origins in supporting distributed C4ISR systems. It was originally conceived as a way to more effectively manage, participate in, and run complex operations such as emergency humanitarian relief, in which there was a premium on speed of integration across numerous organizational boundaries, often resistant to centralized control, dealing with heterogeneous data ranging from sensors to legacy systems. Part of the workflow for operations of this kind included the integration of decision support systems powered by computational techniques such as ML and AI. Because the common communication fabric across participants in these operations was increasingly the Internet and the WWW, our design patterns focused on supporting mission objectives in web-native and standards-conformant ways. datageneous is, in part, the outgrowth and evolution of research in these areas.
is anyone else doing this kind of work?
Yes. Commercially, internal teams at companies like Uber and Airbnb have developed their own platforms, such as Michelangelo and Zipline, specifically to address the complexities of data-driven development for internal business problems that rely on ML. We think these capabilities should be broadly available and simple to use. We also think there is a lot more than just ML that is data-driven, and that there are various domains, ranging from complex supply chains to digitization, that need better ways to address data complexity.
who is behind datageneous, and where are you based?
datageneous is a product of Datagraph GmbH. We are located in Berlin. You can read about our core team here.