Data observability platform Bigeye has released Deltas, a new feature that enables data teams to automatically compare and validate datasets. Deltas replaces custom SQL queries, manual spreadsheet matching, and stand-alone Python scripts with automated comparisons and instant validation. This adds speed and reliability to key parts of the data management process, whether that is migrating data to the cloud (or between clouds), replicating data, or promoting data from staging to production.
The founders of Bigeye, Kyle Kirwan and Egor Gryaznov, managed Uber’s first data warehouse for reporting and data analysis. Kirwan and Gryaznov started Bigeye in 2019 with the intention of solving what they saw as an industry-wide issue: data reliability.
When data is moved, all sorts of problems can occur, including delayed ingestion, lost or duplicate records, and mutated values. Comparing datasets is a crucial step in many data engineering projects, but it is often difficult and time-consuming because it relies on custom SQL queries, complex and overloaded spreadsheets, or custom Python scripts.
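To make the failure modes above concrete, here is a minimal sketch of the kind of hand-written comparison script the article describes. It is not Bigeye’s implementation; the function name `compare_rows`, the `id` key, and the sample rows are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def compare_rows(source, target, key):
    """Compare two lists of dict rows and report common migration problems.

    Hypothetical helper for illustration only; assumes `key` is unique
    in the source rows.
    """
    source_by_key = {row[key]: row for row in source}
    target_by_key = {row[key]: row for row in target}
    target_counts = Counter(row[key] for row in target)

    # Lost records: keys present in the source but absent from the target.
    missing = sorted(k for k in source_by_key if k not in target_counts)
    # Duplicate records: keys that appear more than once in the target.
    duplicated = sorted(k for k, n in target_counts.items() if n > 1)
    # Mutated values: same key, exactly one target row, different contents.
    changed = sorted(
        k
        for k, row in source_by_key.items()
        if target_counts[k] == 1 and row != target_by_key[k]
    )
    return {"missing": missing, "duplicated": duplicated, "changed": changed}


# Made-up sample data: id 1 is altered, id 2 is duplicated, id 3 is lost.
source = [
    {"id": 1, "amount": 10},
    {"id": 2, "amount": 20},
    {"id": 3, "amount": 30},
]
target = [
    {"id": 1, "amount": 11},
    {"id": 2, "amount": 20},
    {"id": 2, "amount": 20},
]

result = compare_rows(source, target, key="id")
print(result)
```

Even this small script has to make decisions about keys, duplicates, and equality for a single pair of tables, which hints at why such checks become costly to maintain across many datasets.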
“We have designed Bigeye to be an extensible framework that allows us to apply data observability to all kinds of exciting use cases. We started by enabling data teams to automatically detect data quality and data pipeline issues. Now, with Deltas, customers can easily compare and validate datasets,” said Gryaznov.
Accurate data comparison means accurate data migration
Udacity, a U.S. for-profit company offering online courses, uses Bigeye to automate monitoring and anomaly detection and to create SLAs that ensure data quality and reliable data pipelines. “Udacity has a strong data culture and we have hundreds of datasets with new additions and enhancements released weekly. The ability to automatically compare datasets before they are promoted to production enables our team to apply best practices in software development, have greater trust in our data, capture problems we would otherwise miss, and speed up our development process,” says Simon Dong, chief data officer at Udacity.
Bigeye users can now identify inconsistencies between even complex datasets in seconds. Deltas uses Bigeye’s on-the-fly query generation to apply the same observability configuration to both datasets, regardless of the SQL dialects of their sources, and detects differences between them. Bigeye promises that Deltas will alert customers to any issues that arise when moving data from A to B.
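The general idea of applying one set of metrics to two datasets and reporting where they disagree can be sketched as follows. This is an illustrative assumption, not Bigeye’s actual query generation: the metric names in `METRICS`, the helper functions, and the table names are hypothetical, and both tables live in a single in-memory SQLite database, whereas a real migration would usually involve two separate systems.

```python
import sqlite3

# Hypothetical metric definitions: metric name -> SQL aggregate template.
METRICS = {
    "row_count": "COUNT(*)",
    "null_count": "SUM(CASE WHEN {col} IS NULL THEN 1 ELSE 0 END)",
    "distinct_count": "COUNT(DISTINCT {col})",
}


def collect_metrics(conn, table, column):
    """Generate one query from METRICS and run it against a table."""
    select_list = ", ".join(
        f"{expr.format(col=column)} AS {name}" for name, expr in METRICS.items()
    )
    row = conn.execute(f"SELECT {select_list} FROM {table}").fetchone()
    return dict(zip(METRICS, row))


def diff_metrics(source_metrics, target_metrics):
    """Return only the metrics whose values differ, as (source, target)."""
    return {
        name: (source_metrics[name], target_metrics[name])
        for name in source_metrics
        if source_metrics[name] != target_metrics[name]
    }


# Made-up staging and production tables for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging_orders (id INTEGER, amount REAL)")
conn.execute("CREATE TABLE prod_orders (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO staging_orders VALUES (?, ?)",
    [(1, 10.0), (2, 20.0), (3, None)],
)
conn.executemany(
    "INSERT INTO prod_orders VALUES (?, ?)",
    [(1, 10.0), (2, 20.0)],
)

differences = diff_metrics(
    collect_metrics(conn, "staging_orders", "amount"),
    collect_metrics(conn, "prod_orders", "amount"),
)
print(differences)
```

Because the metric definitions are kept separate from the queries built from them, the same configuration is applied to both sides. Supporting sources with different SQL dialects would mean generating dialect-specific SQL from those definitions, which this sketch does not attempt.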
Marketplace requirements for secure data management
After announcing on September 23 that it had closed a $45 million Series B round led by Coatue, Bigeye wasted no time proving itself. The company now offers instant validation of datasets in addition to its other complementary products: automatic metrics, automatic thresholds and integrations. Will reliability combined with speed give Bigeye an advantage over other data observability platforms? Monte Carlo offers operational analytics, and WhyLabs seems to be positioning itself to lead with AI innovation in data observability. However, companies like Instacart, Clubhouse and Udacity choose Bigeye to automate monitoring and anomaly detection and to create SLAs that ensure data quality and reliable data pipelines.
Deltas expands Bigeye’s data observability platform, making it easy to map a source and a target, intelligently apply data quality metrics, and detect differences and inconsistencies quickly. Gryaznov added: “We look forward to enabling more groundbreaking user workflows through data observability in the near future.”