Seagate Technology

Data Scientist/Data Engineering Summer Intern

Posted on: 18 Oct 2021

Cupertino, CA

Job Description

Seagate’s Quality Data Analytics and Tools Team is seeking a Data Scientist/Data Engineering Intern for Summer 2022. The intern will work closely with our team of Data Scientists and Engineers to help build ML models and to create, improve, deploy, and maintain the workflows that generate data for our modeling processes.

About the role - you will:

Build predictive and prescriptive ML models using Manufacturing, Process, and Field data
Design and build processes for deploying and monitoring Spark ML jobs
Design queries and workflows for assembling large data sets in Hive and Presto
Help build scripts for automating general data engineering workflows
Participate in code reviews and analysis
Help design and develop general-purpose Python packages
Translate business needs into software requirements and execute on them 
Gain real world experience in software engineering, machine learning engineering, cloud computing, data engineering, and big data.

About you:

Strong interpersonal and communication skills in order to effectively contribute to technical teams and make presentations to a variety of technical and business personnel
Enjoys working with statistics or applied mathematics including machine learning concepts
Previous internship experience is a plus

Your experience includes:

Pursuing a Master’s or Bachelor’s degree in Computer Science, Software Engineering, Computer Engineering, Math, Physics, or other scientific disciplines and enrolled in Fall 2022 classes
Experience with statistics or applied mathematics including machine learning concepts
Understands basic software engineering practices like algorithmic runtime analysis, functional programming, concurrent programming, database design, scripting, and data structures
Demonstrates skills in at least one of the following programming languages: Python, SQL, Scala, Bash, or another scripting language.
Demonstrates skills in at least one of the following software frameworks and tools: Spark, Spark SQL, Presto, Hive, Spark MLlib, H2O, or Apache Airflow
Experience with ETL, EL, and ELT pipelines is a plus

Seagate Technology

Cupertino, CA

Seagate Technology plc provides data storage technology and solutions in Singapore, the United States, the Netherlands, and internationally. The company offers hard disk and solid state drives, including serial advanced technology attachment, serial attached SCSI, and non-volatile memory express products; solid state hybrid drives; and storage subsystems.

Its products are used in enterprise servers and storage systems, as well as in edge compute and non-compute applications. The company also provides an enterprise data solutions portfolio comprising storage subsystems for enterprises, cloud service providers, and scale-out storage servers and original equipment manufacturers (OEMs).

In addition, it offers external storage solutions under the Seagate Backup Plus and Expansion product lines, as well as under the LaCie and Maxtor brands in capacities up to 168TB. The company sells its products primarily to OEMs, distributors, and retailers. Seagate Technology plc was founded in 1978 and is based in Dublin, Ireland.