Data Scientist, Big Data AI/ML

Philadelphia, PA, USA
Up to $125,000/year
Data Scientist Machine Learning Engineer
Actively hiring

hackajob is partnering with Comcast to fill this position. Create a profile to be automatically considered for this role—and others that match your experience.

 

This position is ineligible for visa sponsorship. To be considered for this role, you must be legally authorized to work in the United States and must not require sponsorship for employment now or in the future.

 

Who does the Data Scientist work with? 

The Data Engineering and Science team, part of our Next Generation Access Network, is a diverse group of professionals who work with a wide range of organizational roles: software engineering teams whose software integrates with analytics services, network architects and engineers, service delivery engineers who support our product, testers, operational stakeholders with all manner of information needs, and executives who rely on data for decision making.

 

What are some interesting problems you’ll be working on? 

Develop models capable of processing millions of events per second and billions of events per day, providing both a real-time and a historical view into the operation of our products and services. Work on high-performance real-time data stores and massive historical data sets using industry-leading technology. Work closely with various engineering teams to solve key optimization, insight, and access network data challenges.

 

Where can you make an impact? 

The Comcast Next Generation Access Network Data Engineering and Science team acquires, studies, simulates, and models data so that it becomes a key driver and core functional component in understanding, predicting, and dynamically optimizing the access network to improve overall user experience. Success in this role calls for a broad mix of skills and interests, ranging from traditional distributed systems software engineering to the multidisciplinary field of data science.

 

Responsibilities: 

  • Building a strong intuitive understanding of the problem domain (Next Generation Access Networks) and identifying testable hypotheses to explain interesting phenomena in this domain

  • Selecting and transforming features, building and optimizing classifiers using machine learning techniques

  • Integrating data from multiple sources, including third-party sources

  • Data mining using state-of-the-art methods

  • Enhancing data collection procedures to include information that is relevant for building analytic systems

  • Meeting and communicating frequently with stakeholders to interpret their needs, plan and organize work, and discuss progress and results

  • Developing actionable quantitative models in the areas of effectiveness, ROI, pricing, and optimization

  • Doing ad-hoc analysis and presenting results in a clear manner

  • Creating automated anomaly detection systems and constantly tracking their performance

  • Creating an automated evaluation environment for complex models and constantly tracking their relevant performance metrics

  • Developing and communicating goals, strategies, tactics, project plans, timelines, and key performance metrics to reach those goals

 

 

Here are some of the specific technologies we use:

  • Spark (AWS EMR, Databricks), AWS Lambda

  • Spark Streaming and Batch 

  • Avro, Parquet

  • Stream Data Platforms: Kafka, AWS Kinesis

  • MySQL, Cassandra, HBase, MongoDB, RDBMS

  • Caching frameworks (ElastiCache/Redis)

  • Elasticsearch, Beats, Logstash, Kibana

  • Java, Scala, Go, Python, R

  • Git, Maven, Gradle, Jenkins 

  • Rancher, Puppet, Concourse, Docker, Ansible, Kubernetes 

  • Linux 

  • Hadoop (HDFS, YARN, ZooKeeper, Hive), Presto, Athena

  • Keras, TensorFlow, scikit-learn, Pandas

  • Visualization suite (AWS QuickSight, Grafana)

 

Skills & Requirements: 

  • Graduate degree or PhD in one of the following areas: Statistics, Data Science, Computer Science, or a relevant science or engineering discipline

  • 1+ years working within an enterprise data lake/warehouse environment or big data architecture

  • Understanding of machine learning techniques and algorithms, especially in deep learning, covering both the theoretical underpinnings and the craft (systems such as TensorFlow, Theano, Caffe, and scikit-learn, and their APIs)

  • Applied statistics skills and understanding of probability distributions, statistical testing, regression, etc.

  • Experience with common data science toolkits, such as scikit-learn and R. Excellence in at least one of these is highly desirable.

  • Great communication skills

  • Experience with data visualization tools, such as D3.js, ggplot2, and Matplotlib

  • Proficiency in query languages such as SQL and HiveQL

  • Experience with NoSQL databases, such as MongoDB, Redis/ElastiCache, Cassandra, and HBase

  • Good scripting and programming skills in languages and frameworks such as Java, Scala, R, Python, or Spark

  • Data-oriented personality
