Frequently Asked Questions
Q. Queryplex is cool, but what do all the terms mean – nodes, constellation, data sources and federation?
Nodes. These are the devices you want to query. They can be small (an IoT sensor) or large (an entire Big Data cluster).
Constellation. This is the collection of nodes that you will query. Although Queryplex lets you query an individual node, the default behavior is to query an entire constellation at once. All data in the constellation appears to your application as if it were in a single repository.
Data source. This is a source repository located on a node. For example, you may have a server (node) that contains a relational database such as Db2. The Db2 database is the data source. Multiple data sources on a single node are fully supported. For example, a node could host two Db2 databases and a PostgreSQL database at the same time.
Federation. Federation is the abstraction that makes data on data sources available to queries in Queryplex. For example, if you have a data source (such as PostgreSQL) that has two tables within it, Table1 and Table2, you can select one of these tables, but not the other, to be made available in the constellation for queries. To do this, visit “Configuration” then “Table Federation”.
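To make the relationship between these four terms concrete, here is a minimal sketch in Python. It is not the Queryplex API: the class names, method names and the node and data source names (server1, postgres1, db2a, Orders) are hypothetical and exist only to illustrate the terminology. It models a node hosting two data sources, with only Table1 federated from the PostgreSQL source.

```python
from dataclasses import dataclass, field


@dataclass
class DataSource:
    """A repository hosted on a node, e.g. a Db2 or PostgreSQL database (illustrative model only)."""
    name: str
    tables: set[str]
    federated: set[str] = field(default_factory=set)

    def federate(self, table: str) -> None:
        """Mark one table as available to constellation queries."""
        if table not in self.tables:
            raise KeyError(f"{table!r} is not a table in {self.name!r}")
        self.federated.add(table)


@dataclass
class Node:
    """A device you want to query; it may host several data sources."""
    name: str
    data_sources: list[DataSource] = field(default_factory=list)


@dataclass
class Constellation:
    """The collection of nodes that is queried as if it were one repository."""
    nodes: list[Node] = field(default_factory=list)

    def visible_tables(self) -> list[str]:
        """Return only the tables that have been federated, across all nodes."""
        return sorted(
            f"{node.name}.{source.name}.{table}"
            for node in self.nodes
            for source in node.data_sources
            for table in source.federated
        )


# Hypothetical setup: one server (node) with a PostgreSQL and a Db2 data source.
postgres = DataSource("postgres1", tables={"Table1", "Table2"})
postgres.federate("Table1")  # Table2 stays hidden from queries
server = Node("server1", data_sources=[postgres, DataSource("db2a", tables={"Orders"})])
constellation = Constellation(nodes=[server])

print(constellation.visible_tables())  # ['server1.postgres1.Table1']
```

In this model, Table2 and the Db2 table Orders exist on the node but are invisible to queries until they are federated, which mirrors the Table1/Table2 example above.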
Q. How do I get started?
To start using IBM Queryplex, first join our trials! We are in trial mode in the months leading up to general availability. Contact us here to get started.
Once you have an account, log in to the user console and begin defining your constellation. A constellation is the collection of data nodes you want to query, and it is the fundamental query scope of Queryplex: it determines which data nodes a query can access and which data sources within those nodes are accessible. For example, a data node could be a server hosting a relational database, a Big Data cluster, a server that has some interesting Excel files on it, or an IoT device. Within each data source you can select subsets of the data to include; for example, you may not want to expose every table or file on the data source.
The console walks you through deploying the Queryplex node software on your data sources and then defining which data is to be included from each one. We know you may have a large number of data sources, so the console is specifically designed to make this a simple, scalable process for all of us mere humans.
Q. Is data that I access through IBM Queryplex secure?
Yes. Data flowing through the IBM Queryplex network is protected with 128-bit encryption.
Q. Is data that I access through IBM Queryplex stored or replicated into the cloud?
No. IBM Queryplex passes intermediate and final query results through the network, but it never stores data anywhere.
Q. How does IBM Queryplex create the “computational mesh” that enables all of the participating CPUs to collaborate?
We’d like to tell you that, but then we’d have to….
Q. How much does it cost?
We’ve just started trials of the new technology, and for trial participants it is completely free for the duration of the trial. If you are interested, please contact us about trials.
IBM Queryplex will not be expensive. We want you to have the flexibility to use only what you need and pay only for what you use, so the actual costs will depend on your usage, specifically the number and size (data volume) of the data sources you connect.
Q. The performance is too good to be true! How is IBM Queryplex doing this?
Honestly, we have a lot of clever engineering going on in IBM Queryplex! First, there is the computational mesh we’ve told you about. Second, we have some of the world’s finest distributed access planning in the query compiler, which makes it possible to plan execution strategies for even the most complex enterprise-class queries. Third, our unique, organically formed constellation strategy keeps the constellation as narrow as mathematically possible, minimizing network hops as data flows through it while still being sensitive to inter-node latency and geospatial properties. Yup, in short, it’s magic.