AI Data Engineer

Remote
Data Engineer, Artificial Intelligence Engineer
Informa
Actively hiring

Sign up for the chance to get matched to this role, and similar opportunities.

The AI Data Engineer is responsible for shaping, creating and integrating AI solutions, with a particular focus on Natural Language Processing, Large Language Models and Multi-Modal Generative AI. The candidate will work alongside other members of the team, divisional colleagues and Data Engineering teams to further drive AI development and innovation. The AI Data Engineer will work closely with data scientists, delivery leads, product managers, subject matter experts and technology teams to develop, automate and scale up advanced AI solutions that address key customer problems, whilst also helping to develop methodologies and reusable solutions.

As the AI Data Engineer, you will provide technical expertise and leadership across all aspects of the data engineering process, from data acquisition and tagging through to embedding, protection and storage.

Your Role:

  • Integrate and Manage Data Sources: Work with a variety of data types including PDFs, Word documents, Excel files, HTML, audio, video, and text, as well as various databases, to integrate and manage data across the organization.
  • Data Preparation for AI Modeling: Prepare and preprocess data to ensure it is ready for use by AI Lead Engineers in machine learning models and AI solutions.
  • Develop and Maintain Data Pipelines: Utilize tools like Airflow and Python to develop and maintain efficient, scalable data pipelines that support the ingestion, transformation, and delivery of large datasets.
  • Manage Data Lakes and Warehouses: Oversee the organization's data lakes and warehouses, ensuring data is stored efficiently and is easily accessible for AI applications.
  • Implement Data Quality and Governance: Ensure high standards of data quality and implement governance practices to maintain the integrity and security of data.
  • Collaborate with AI Teams: Work closely with AI Lead Engineers and other stakeholders to understand data needs and contribute to the development of AI-driven solutions.
  • Stay Current with Emerging Technologies: Keep up to date with the latest developments in data engineering, machine learning, and AI technologies to continually enhance data capabilities.

Qualifications:

  • Experience: A couple of years of data engineering experience, with a focus on AI and machine learning projects.
  • Technical Proficiency: Strong skills in Python, Airflow, data lakes, and data pipeline tools. Experience with data formats like JSON, Parquet, Avro, etc.
  • Knowledge of Data Processing Technologies: Familiarity with big data technologies (e.g., Hadoop, Spark, AWS EMR) and database management systems (e.g., MongoDB, PostgreSQL).
  • Experience with Data Integration Tools: Proficiency in ETL/ELT tools and practices.
  • Familiarity with AI and ML Concepts: Understanding of the data requirements for machine learning and AI, including experience with embeddings and vector databases.
  • Problem-Solving Skills: Ability to tackle complex data integration challenges and provide efficient solutions.
  • Communication Skills: Strong interpersonal skills with the ability to explain technical data concepts to non-technical stakeholders.
  • Education: Bachelor’s or Master’s degree in Computer Science, Data Science, Engineering, or a related field.
  • Experience with cloud platforms like AWS or Azure.
  • Familiarity with containerization technologies like Docker.
  • Knowledge of data security and privacy practices.
