Definitions:
- Row Key: Each row has a unique Row Key
- Column Family: Used to group different columns; defined when the table is created
- Column Qualifier: Identifies each column within a Column Family; can be added at run time. Qualifiers are expandable, so each row may have a different number of them
- Timestamp: A cell can hold different versions of its data, one per timestamp (for consistency)
A Row is referred to by a Row Key
A Column is referred to by a Column Family and a Column Qualifier
A Cell is referred to by a Row and a Column
A Cell can contain multiple versions of a value, one per timestamp
The four parts above combine into a Key in the following format:
[Row Key]/[Column Family]:[Column Qualifier]/[timestamp]
With this Key, we can find the unique value we want to read.
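As a minimal sketch of this addressing scheme, the Python below models a table as nested dictionaries and builds the key string in the format above. It is a toy model for illustration only, not the HBase client API; the names `table`, `format_key`, `put`, and `get` are made up for this example.

```python
# Toy in-memory model: row key -> "family:qualifier" -> {timestamp: value}.
# This only illustrates the addressing scheme; it is not how HBase is implemented.
table = {}


def format_key(row_key, family, qualifier, timestamp):
    """Build a key as [Row Key]/[Column Family]:[Column Qualifier]/[timestamp]."""
    return f"{row_key}/{family}:{qualifier}/{timestamp}"


def put(row_key, family, qualifier, timestamp, value):
    """Store one version of a cell."""
    column = f"{family}:{qualifier}"
    table.setdefault(row_key, {}).setdefault(column, {})[timestamp] = value


def get(row_key, family, qualifier, timestamp=None):
    """Return the version at `timestamp`, or the newest version if none is given."""
    versions = table.get(row_key, {}).get(f"{family}:{qualifier}", {})
    if not versions:
        return None
    if timestamp is None:
        timestamp = max(versions)
    return versions.get(timestamp)


put("user1", "info", "name", 1, b"Alice")
put("user1", "info", "name", 2, b"Alicia")  # a newer version of the same cell

print(format_key("user1", "info", "name", 2))  # user1/info:name/2
print(get("user1", "info", "name"))            # b'Alicia' (newest version)
print(get("user1", "info", "name", 1))         # b'Alice'  (older version)
```

Leaving out the timestamp returns the newest version, which is a choice made for this sketch.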
Every value we read comes back as a raw byte[]; HBase does not record its type, so developers need to know the structure of each value themselves.
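To make the point about raw bytes concrete, here is a small sketch of application-side encoding and decoding. The conventions used (8-byte big-endian integers, UTF-8 strings) are assumptions chosen for this example, and the helper names are hypothetical; the only thing that matters is that the writer and the reader agree on them.

```python
import struct


def encode_long(n):
    """Encode an integer as 8 bytes, big-endian, signed (assumed convention)."""
    return struct.pack(">q", n)


def decode_long(raw):
    """Decode 8 big-endian bytes back into an integer."""
    return struct.unpack(">q", raw)[0]


def encode_str(s):
    """Encode a string as UTF-8 bytes (assumed convention)."""
    return s.encode("utf-8")


def decode_str(raw):
    """Decode UTF-8 bytes back into a string."""
    return raw.decode("utf-8")


raw_age = encode_long(42)       # what would be stored in the cell
raw_name = encode_str("Alice")

print(decode_long(raw_age))     # 42
print(decode_str(raw_name))     # Alice
```

Nothing in the stored bytes says which decoder to use, so reading a value with the wrong one can fail or silently return a meaningless result.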
Rows are strongly consistent.
How does HBase store its data in the file system?
- Contiguous rows are grouped into the same Region
- Each Region is divided by Column Family into several units
- Each of these units is stored in its own files in the file system (one or more files per unit)
- The entries in each file are in lexicographical order of their keys: by Row Key first, then by Column Qualifier, with the newest timestamp first
So if we open one HBase data file, it will contain only:
- Contiguous rows
- Values in the same Column Family
- Entries sorted by Row Key, then by Column Qualifier within each row
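The layout described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not HBase code: the sample cells, the split point `"row3"`, and the names `region_of` and `layout` are all made up for illustration, and each (Region, Column Family) unit is modelled as exactly one sorted list. As far as I understand HBase's ordering, entries sort by Row Key, then Column Qualifier, then timestamp with the newest first.

```python
from collections import defaultdict

# Hypothetical cells: (row key, column family, column qualifier, timestamp, value)
cells = [
    ("row3", "info", "name", 1, b"Carol"),
    ("row1", "info", "name", 1, b"Alice"),
    ("row1", "stats", "visits", 1, b"7"),
    ("row2", "info", "age", 1, b"30"),
    ("row1", "info", "age", 2, b"31"),
    ("row1", "info", "age", 1, b"30"),
]

# Assumed split point: rows below "row3" fall in region 0, the rest in region 1.
SPLIT_POINTS = ["row3"]


def region_of(row_key):
    """Index of the region whose contiguous row range contains row_key."""
    return sum(1 for split in SPLIT_POINTS if row_key >= split)


def layout(cells):
    """Group cells into one sorted list per (region, column family)."""
    files = defaultdict(list)
    for row, family, qualifier, ts, value in cells:
        files[(region_of(row), family)].append((row, qualifier, ts, value))
    for entries in files.values():
        # Row key, then qualifier, ascending; newest timestamp first.
        entries.sort(key=lambda e: (e[0], e[1], -e[2]))
    return dict(files)


files = layout(cells)
for (region, family), entries in sorted(files.items()):
    print(f"region {region}, family {family}:")
    for row, qualifier, ts, value in entries:
        print(f"  {row}/{family}:{qualifier}/{ts} -> {value!r}")
```

Each printed group holds only contiguous rows from a single Column Family, which is why reading one column family for a range of rows only needs the files of that family.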