Capco – Collibra Data Lineage Automation Engineer – Contractor
Role Overview
Capco is seeking a Data Lineage Automation Engineer to design and implement automated end-to-end data lineage solutions across a complex enterprise data ecosystem. This role combines data engineering, metadata management, and AI/ML to enable lineage capture, traceability, and intelligent metadata extraction across cloud, legacy, and reporting systems.
Key Responsibilities
- Lead implementation of automated data lineage across hybrid environments:
  - Cloud platforms (Snowflake, AWS)
  - Legacy relational databases and ETLs
  - NoSQL stores
  - BI/reporting platforms (Tableau, Power BI)
- Implement or extend lineage frameworks such as Spline, OpenLineage, or Marquez.
- Build connectors, extractors, or agents to bridge gaps in lineage capture.
- Integrate with metadata platforms (e.g., Collibra) to publish lineage in a usable format.
- Apply AI/ML techniques to infer lineage where automated capture is incomplete.
- Develop reusable lineage components for cross-domain operational use.
- Advise stakeholders on lineage standardization, storage, and operational best practices.
Required Skills & Experience
- Proven experience delivering automated data lineage solutions in hybrid architectures.
- Hands-on expertise with Spline, OpenLineage, Marquez, or comparable frameworks.
- Deep understanding of metadata capture, ETL tracing, and query execution mapping.
- Strong AI/ML background in metadata intelligence, code parsing (NLP), or pattern detection.
- Experience integrating lineage with governance tools (Collibra, Alation).
- Strong programming skills: Python, Scala, or Java.
- Query expertise across SQL and NoSQL systems: Snowflake, SQL Server, Oracle, MongoDB, etc.
Preferred Skills (Big Plus)
- Experience with commercial data lineage tools.
- Prior work in regulated industries (finance, healthcare).
- Familiarity with event-based architectures for real-time lineage.
- Knowledge of data mesh or domain-driven lineage approaches.
Ideal Candidate Profile
- Enterprise-scale automated lineage implementation experience.
- Operates at the intersection of data engineering, governance, and AI.
- Acts as a technical thought partner for architecture and governance teams.
- Automation-first, reuse-oriented mindset.
Capco is partnering with hackajob on-demand to fill this position. Create a profile to be automatically considered for this role and others that match your experience.