Amundsen Data
Lyft's data warehouse is on Hive, and all physical partitions are stored in S3.
Amundsen data. An interview about the open source Amundsen platform for data discovery and how Lyft is using it to improve their analytics workflow. A PageRank-inspired search algorithm recommends results based on names, descriptions, tags, and querying/viewing activity on the ... A graph data model is not a common choice for most applications, but we believe it is a great fit for Amundsen, as it deals with a lot of relationships among entities. Atmospheric data 21 6 meters high.
Amundsen real time data. Now let's talk technical. Data is only valuable if you use it for something, and the first step is knowing that it is available. At the same time, there's a lot of value a metadata-driven solution can provide in the space of compliance, in tracking personal data across the entire data infrastructure.
Started at Lyft, Amundsen has made data engineers, data analysts, and data scientists 20% more productive. Search for data within your organization by a simple text search. Amundsen is a metadata-driven application for improving the productivity of data analysts, data scientists, and engineers when interacting with data. Updated page (UTC): 2020-08-22 12:59.
Amundsen chose the graph data model to represent its metadata. In a graph data model, entities are represented as vertices and the relationships between them as edges. Amundsen now empowers all employees at Lyft, from new employees to the most experienced, to become autonomous in their data discovery for their daily tasks. Lyft has built a data discovery platform, Amundsen, which has worked really well in improving the productivity of its data scientists through faster data discovery.
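To make the vertex-and-edge idea above concrete, here is a minimal sketch of a metadata graph in plain Python. It is an illustration only: the class names, the key format ("table:rides") and the relation labels (OWNS, READS_FROM) are hypothetical and are not Amundsen's actual schema or storage layer.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Vertex:
    key: str    # hypothetical unique id, e.g. "table:rides"
    label: str  # entity type, e.g. "Table", "Dashboard", "User"


@dataclass(frozen=True)
class Edge:
    start: str     # key of the source vertex
    end: str       # key of the target vertex
    relation: str  # hypothetical relation name, e.g. "OWNS"


@dataclass
class MetadataGraph:
    vertices: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_vertex(self, key, label):
        self.vertices[key] = Vertex(key, label)

    def add_edge(self, start, end, relation):
        # Refuse dangling edges: both endpoints must already be vertices.
        if start not in self.vertices or end not in self.vertices:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges.append(Edge(start, end, relation))

    def neighbors(self, key, relation=None):
        # Follow outgoing edges, optionally filtered by relation name.
        return [
            e.end
            for e in self.edges
            if e.start == key and (relation is None or e.relation == relation)
        ]


graph = MetadataGraph()
graph.add_vertex("table:rides", "Table")
graph.add_vertex("dashboard:weekly_rides", "Dashboard")
graph.add_vertex("user:alice", "User")
graph.add_edge("user:alice", "table:rides", "OWNS")
graph.add_edge("dashboard:weekly_rides", "table:rides", "READS_FROM")

print(graph.neighbors("user:alice"))
```

Questions such as "which tables does this user own" or "which tables does this dashboard read from" become simple edge traversals, which is the kind of relationship-heavy lookup the text argues a graph model suits.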
Amundsen is built on 3 key pillars. As organizations grow and data sources proliferate, it becomes difficult to keep track of everything. Augmented data graph: Amundsen uses a graph database under the hood to store relationships between various data assets (tables, dashboards, protobuf events, etc.). How does it work?
Data discovery adds 30% more productivity to data scientists. Metadata is key to the next wave of big data applications. Amundsen: Lyft's metadata and data discovery platform. How Amundsen democratizes data discovery: showing the relevant data. Highly queried tables show up earlier. Amundsen is a data discovery and metadata engine for improving the productivity of data analysts, data scientists, and engineers when interacting with data.
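The idea that highly queried tables show up earlier can be sketched as a usage-weighted text search. This is a simplified heuristic for illustration, not PageRank and not Amundsen's actual ranking: the field names (name, description, tags, query_count), the sample tables and the log-based weighting are all assumptions.

```python
import math


def score(table, query):
    """Score a table: text matches boosted by how often the table is queried."""
    terms = query.lower().split()
    text = " ".join(
        [table["name"], table["description"], " ".join(table["tags"])]
    ).lower()
    matches = sum(1 for term in terms if term in text)
    if matches == 0:
        return 0.0
    # log1p dampens usage so very popular tables do not swamp text relevance.
    return matches * (1.0 + math.log1p(table["query_count"]))


def search(tables, query):
    """Return names of matching tables, highest score first."""
    scored = [(score(t, query), t["name"]) for t in tables]
    hits = [(s, name) for s, name in scored if s > 0]
    return [name for s, name in sorted(hits, key=lambda p: p[0], reverse=True)]


# Hypothetical sample metadata.
tables = [
    {"name": "rides_staging", "description": "raw rides events",
     "tags": ["rides"], "query_count": 3},
    {"name": "rides", "description": "completed rides",
     "tags": ["rides", "core"], "query_count": 5000},
    {"name": "drivers", "description": "driver profiles",
     "tags": ["core"], "query_count": 900},
]

results = search(tables, "rides")
print(results)
```

Both tables matching "rides" are returned, and the heavily queried one comes first; the table with no text match is left out regardless of its popularity.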
Mosaic image from 360 camera on the CCGS Amundsen. We will introduce Amundsen, which is an open source data discovery platform from Lyft.