Last November, I had the opportunity to present at AIOUG Sangam 2017. My session was titled “Harness the Power of Data in a Big Data Lake”. The abstract is below:
“Data lake” is a relatively new term compared with the other buzzwords that have appeared since the industry realized the potential of data. Organizations are planning to adopt the big data lake as their key data store, but what challenges them is the traditional approach. Traditional approaches to data pipelines, data processing, and data security still hold good, but architects need to go the extra mile when designing a big data lake.
This session will focus on this shift in approaches. We will explore the roadblocks in setting up a data lake and how to size the key milestones. The health and efficiency of a data lake largely depend on two factors: data ingestion and data processing. Attend this session to learn key practices for data ingestion under different circumstances. Data processing for a variety of scenarios will be covered as well.
Here is the link to my presentation –
The session was an excerpt from my upcoming book on the enterprise data lake. The book should be out within a month and is available at all online bookstores.
Amazon – https://www.amazon.com/Practical-Enterprise-Data-Lake-Insights/dp/1484235215/
When designing an enterprise data lake, you often hit a roadblock when you must leave the comfort of the relational world and learn the nuances of handling non-relational data. Starting from sourcing data into the Hadoop ecosystem, you will go through stages that can bring up tough questions about data processing, data querying, and security. Concepts such as change data capture and data streaming are covered. The book takes an end-to-end solution approach in a data lake environment that includes data security, high availability, data processing, data streaming, and more.
Each chapter includes application of a concept, code snippets, and use case demonstrations to provide you with a practical approach. You will learn the concept, scope, application, and starting point.
What you’ll find in the book
- Get to know data lake architecture and design principles
- Implement data capture and streaming strategies
- Implement data processing strategies in Hadoop
- Understand the data lake security framework and availability model