Big data – Hadoop, HBase and MongoDB in TrakkBoard
TrakkBoard is a web-based software-as-a-service solution. All data is stored in a cloud of multiple servers to ensure timely processing. It is based on a Hadoop cluster that runs in a secure environment with multiple backups and is used to store large amounts of data that can be requested through its numerous APIs. Part of Hadoop is HDFS (Hadoop Distributed File System), which serves as a highly available, high-performance file system for storing large amounts of data across servers. Another module of Hadoop is MapReduce.
Why we use Hadoop:
Big Data: TrakkBoard is a flexible platform that queries and stores huge amounts of data daily through various APIs. Conventional databases quickly reach their limits and cannot process this data adequately. The volume of data stored in TrakkBoard steadily increases at a rapid rate, so the whole server architecture must be built for a high degree of scalability.
Scalability: Hadoop gives us the ability to scale very quickly simply by adding more servers to the cluster. Every day thousands of accounts aggregate data via APIs, and all of it is stored in the cluster, which adds up to an enormous volume of data.
Reliability: Hadoop is down? We have yet to experience it! Hadoop has run reliably and without problems since day one.
Data processing: Via the chart configurator, charts can be calculated very flexibly using data from different data sources. For this purpose, MapReduce jobs run against the Hadoop cluster to process the data in the shortest possible time.
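The chart aggregation described above follows the classic MapReduce pattern. The following is a minimal local sketch in Python of that pattern; the field names and sample records are hypothetical, and real TrakkBoard jobs run as MapReduce on the Hadoop cluster rather than in-process like this:

```python
from collections import defaultdict

# Hypothetical raw records as they might arrive from a social-media API.
records = [
    {"account": "acme", "metric": "likes", "value": 120},
    {"account": "acme", "metric": "likes", "value": 80},
    {"account": "globex", "metric": "likes", "value": 45},
]

def map_phase(record):
    """Map step: emit one (key, value) pair per input record."""
    yield (record["account"], record["value"])

def reduce_phase(key, values):
    """Reduce step: aggregate all values collected for one key."""
    return key, sum(values)

# Shuffle: group the mapped pairs by key, as Hadoop does between
# the map and reduce phases.
groups = defaultdict(list)
for record in records:
    for key, value in map_phase(record):
        groups[key].append(value)

totals = dict(reduce_phase(k, vs) for k, vs in groups.items())
print(totals)  # {'acme': 200, 'globex': 45}
```

Because each map and reduce call is independent, Hadoop can spread both phases across the servers of the cluster, which is what keeps chart calculation fast as the data grows.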
Universal data access: Because Hadoop serves as simple storage for raw data, the various modules of TrakkBoard can flexibly access and process all of it.
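Treating the cluster as plain raw-data storage means each module applies its own interpretation when it reads (schema-on-read). A small sketch of that idea, with hypothetical records and module names; in production the lines would come from HDFS rather than an in-memory string:

```python
import json

# Hypothetical raw API snapshots, stored as JSON lines.
raw = "\n".join(json.dumps(r) for r in [
    {"account": "acme", "followers": 1500, "ts": "2014-01-01"},
    {"account": "acme", "followers": 1650, "ts": "2014-02-01"},
])

def chart_module(lines):
    """One consumer: extracts only the follower counts for plotting."""
    return [json.loads(line)["followers"] for line in lines]

def report_module(lines):
    """Another consumer: derives growth between first and last snapshot."""
    recs = [json.loads(line) for line in lines]
    return recs[-1]["followers"] - recs[0]["followers"]

lines = raw.splitlines()
print(chart_module(lines))   # [1500, 1650]
print(report_module(lines))  # 150
```

The raw store never changes; only the readers differ, so new modules can be added without migrating the stored data.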
Open Source: Hadoop is developed under the Apache License and is available as an open-source project. The global developer community ensures that Hadoop and its entire ecosystem continue to evolve at a very high pace.