A surprising recent revelation in data warehousing is that Facebook runs a traditional data warehouse built on relational technology.
The news is interesting because Facebook is one of the representatives of a new generation of companies that deal with very large amounts of data and generally avoid relational technology as too cumbersome for their needs. These companies have been inventing new database technologies that can handle such large volumes of data and deliver fast response times, but that typically don’t support the widely used query language SQL.
At the recent conference of The Data Warehousing Institute (TDWI), a representative from Facebook explained that the company chose relational technology for its data warehouse because the right technology should be used for the right purpose.
Newer database technologies, especially those built on the Hadoop platform that Facebook also uses, are better at handling large amounts of unstructured data. Relational databases, on the other hand, require a stricter database structure that typically includes constraints on data elements and defines the dimensions by which users analyze data in the data warehouse.
Facebook decided to use both. Non-relational databases are used for analytical data discovery, while the relational database serves as a consistent data source for querying core business data.
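To make the idea of a stricter relational structure concrete, here is a minimal sketch of a tiny warehouse-style schema using Python's built-in sqlite3 module. It is only an illustration: the table names, columns and sample rows are invented for this example and say nothing about how Facebook actually models its data. It shows one dimension table, one fact table with constraints on its data elements, and a query that analyzes the facts by that dimension.

```python
import sqlite3


def build_warehouse():
    """Create a tiny in-memory star schema (all names are hypothetical)."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        -- Dimension: one of the axes by which users analyze the facts.
        CREATE TABLE dim_country (
            country_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL UNIQUE
        );
        -- Fact table: every row must reference a known country and
        -- carry a non-negative revenue value.
        CREATE TABLE fact_revenue (
            country_id INTEGER NOT NULL REFERENCES dim_country(country_id),
            day        TEXT NOT NULL,
            revenue    REAL NOT NULL CHECK (revenue >= 0)
        );
        """
    )
    conn.executemany(
        "INSERT INTO dim_country VALUES (?, ?)",
        [(1, "US"), (2, "DE")],
    )
    conn.executemany(
        "INSERT INTO fact_revenue VALUES (?, ?, ?)",
        [(1, "2012-01-01", 100.0), (1, "2012-01-02", 50.0), (2, "2012-01-01", 30.0)],
    )
    conn.commit()
    return conn


def try_insert(conn, country_id, day, revenue):
    """Insert one fact row; return False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO fact_revenue VALUES (?, ?, ?)",
            (country_id, day, revenue),
        )
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        conn.rollback()
        return False


def revenue_by_country(conn):
    """Aggregate the fact table along the country dimension."""
    return conn.execute(
        """
        SELECT d.name, SUM(f.revenue)
        FROM fact_revenue AS f
        JOIN dim_country AS d ON d.country_id = f.country_id
        GROUP BY d.name
        ORDER BY d.name
        """
    ).fetchall()


if __name__ == "__main__":
    warehouse = build_warehouse()
    print(revenue_by_country(warehouse))
    # An unknown country and a negative amount are both rejected.
    print(try_insert(warehouse, 99, "2012-01-03", 10.0))
    print(try_insert(warehouse, 1, "2012-01-03", -5.0))
```

The point of the sketch is that the database itself refuses rows that break the rules, which is what makes the relational warehouse a consistent source for core business data. A store for unstructured data would typically accept such rows and leave the interpretation to whoever reads them later.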
Data warehouse practitioners like me are confident that data warehouses are here to stay, because the reasons they are needed remain valid regardless of whatever new technologies appear. I therefore strongly agree with Facebook that relational technology is the best choice for its data warehouse because it serves the intended purpose.