What is the best way to store a relation in main memory?
I am working on an application which is a mini DBMS design for evaluating SPJ queries. The program is being implemented in C++.
When I have to process a query for joins and group-by, I need to maintain a set of records in the main memory. Thus, I have to maintain temporary tables in main memory for executing the queries entered by the user.
My question is, what is the best way to achieve this in C++? What data structure do I need to make use of in order to achieve this?
In my application, I am storing data in binary files and using the Catalog (which contains the schema for all the existing tables), I need to retrieve data and process them.
I have only 2 datatypes in my application: int (4 Bytes) and char (1 Byte)
I could use std::vector. In fact, I tried a vector of vectors, where the inner vector stores the attributes of one record. The problem is that the database can hold many relations, each of which may have any number of attributes, and each attribute can be either an int or a char. So I cannot work out the best way to do this.
Edit
I cannot use a struct for the tables because I do not know how many columns exist in the newly added tables, since all tables are created at runtime as per the user query. So, a table schema cannot be stored in a struct.
A Relation is a Set of Tuples (and in SQL, a Table is a Bag of Rows). Both in Relational Theory and in SQL, all tuples (/rows) in a relation (/table) "comply to the heading".
So it makes sense for an object that stores relations (/tables) to consist of two components: an object of type "Heading" and a Set (/Bag) object containing the actual tuples (/rows).
The "Heading" object is itself a Mapping of attribute (/column) names to "declared data types". I don't know C, but in Java it might be something like Map<AttributeName,TypeName> or Map<AttributeName,Type> or even Map<String,String> (provided you can use those Strings to go get the actual 'Type' objects from wherever they reside).
The set of tuples (/rows) consists of members that are each a Mapping of attribute (/column) names to attribute Values, which in your case are either int or String. The biggest problem here is that this suggests you need something like Map<AttributeName,Object>, and you might get into trouble because your ints are not objects.
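In C++ the same idea can be sketched without a common Object base class: std::variant<int, char> (C++17) holds either of the two types from the question. The names AttrType, Value, Heading, Tuple, Relation and conforms below are made up for illustration and do not come from any library; treat this as a minimal sketch under those assumptions rather than a finished design.

```cpp
#include <map>
#include <string>
#include <variant>
#include <vector>

// Hypothetical names throughout; none of this comes from an existing library.
enum class AttrType { Int, Char };

// A value is either an int or a char, the only two types in the question.
using Value = std::variant<int, char>;

// Heading: attribute name -> declared type.
using Heading = std::map<std::string, AttrType>;

// Tuple: attribute name -> value.
using Tuple = std::map<std::string, Value>;

// True if the tuple has exactly the heading's attributes with matching types.
bool conforms(const Heading& h, const Tuple& t) {
    if (h.size() != t.size()) return false;
    for (const auto& [name, type] : h) {
        auto it = t.find(name);
        if (it == t.end()) return false;
        const bool isInt = std::holds_alternative<int>(it->second);
        if ((type == AttrType::Int) != isInt) return false;
    }
    return true;
}

// A relation pairs a heading with its rows (kept as a bag, as in SQL).
struct Relation {
    Heading heading;
    std::vector<Tuple> rows;

    // Adds the tuple only if it complies with the heading.
    bool insert(const Tuple& t) {
        if (!conforms(heading, t)) return false;
        rows.push_back(t);
        return true;
    }
};
```

Here a row is rejected by insert when an attribute is missing or holds the wrong alternative, which is one way to enforce "comply to the heading". Keying every tuple by attribute name is simple but costs a map lookup per access; storing values by column position would be a reasonable variation.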
As a generic container for any table rows, I'd most likely use std::vector (as pointed out by Iarsmans). As for the table columns, I'd most likely define those with structs representing the table schema. For example:
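#include <vector>

// One row of a table whose columns are known at compile time.
struct DataRow
{
    int col1;
    char col2;
};

// A table is a sequence of rows.
typedef std::vector<DataRow> DataTable;

int main()
{
    DataTable t;
    DataRow dr;
    dr.col1 = 1;
    dr.col2 = 'a';
    t.push_back(dr);  // append the row to the table
}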
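The struct above fixes the columns at compile time, which the question's edit says is not possible when tables are created at run time. One possible alternative, sketched here under assumptions, keeps the schema as data and packs each row into a flat byte buffer: 4 bytes per int and 1 byte per char, matching the sizes given in the question. ColType, Column, Table and the accessor names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical names; a sketch of a runtime-schema table, not a full design.
enum class ColType { Int, Char };

struct Column {
    std::string name;
    ColType type;
    std::size_t offset;  // byte offset of this column inside a row
};

class Table {
public:
    // Appends a column; call this only before any rows are added.
    void addColumn(const std::string& name, ColType type) {
        columns_.push_back({name, type, rowSize_});
        rowSize_ += (type == ColType::Int) ? sizeof(std::int32_t) : 1;
    }

    // Appends a zero-filled row and returns its index.
    std::size_t addRow() {
        data_.resize(data_.size() + rowSize_, 0);
        return rowCount_++;
    }

    // The accessors do not check the column type; the caller is expected
    // to consult the schema first.
    void setInt(std::size_t row, std::size_t col, std::int32_t v) {
        std::memcpy(cell(row, col), &v, sizeof v);
    }
    std::int32_t getInt(std::size_t row, std::size_t col) const {
        std::int32_t v;
        std::memcpy(&v, cell(row, col), sizeof v);
        return v;
    }
    void setChar(std::size_t row, std::size_t col, char c) {
        *cell(row, col) = static_cast<unsigned char>(c);
    }
    char getChar(std::size_t row, std::size_t col) const {
        return static_cast<char>(*cell(row, col));
    }

    std::size_t rowCount() const { return rowCount_; }
    std::size_t rowSize() const { return rowSize_; }
    const std::vector<Column>& columns() const { return columns_; }

private:
    // Address of the first byte of the given cell.
    unsigned char* cell(std::size_t row, std::size_t col) {
        return data_.data() + row * rowSize_ + columns_[col].offset;
    }
    const unsigned char* cell(std::size_t row, std::size_t col) const {
        return data_.data() + row * rowSize_ + columns_[col].offset;
    }

    std::vector<Column> columns_;
    std::vector<unsigned char> data_;  // all rows, packed back to back
    std::size_t rowSize_ = 0;
    std::size_t rowCount_ = 0;
};
```

A table with an int column followed by a char column would use 5 bytes per row here. Because every row has the same size, a block of rows read from the binary file could in principle be copied straight into the buffer, assuming the file uses the same layout and byte order.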