Collaborate with Innovative 3Mers Around the World
Choosing where to start and grow your career has a major impact on your professional and personal life, so it’s important to know that the company you choose, and its leaders, will support and guide you. With a diversity of people, global locations, technologies and products, 3M is a place where you can collaborate with other curious, creative 3Mers.
This position provides an opportunity to transition from other private, public, government or military experience to a 3M career.
The Impact You’ll Make in this Role
3M is looking for a skilled Unstructured Data Engineering Lead to join our team. As a key member of our organization, you will lead the development of pipelines that preprocess unstructured data, eliminate duplicate data and text noise, chunk data, and generate vector embeddings. The ideal candidate has a strong background in data engineering, particularly with unstructured data, along with strong Python programming skills, expertise in cloud engineering, and experience with open source software.
As an Unstructured Data Engineering Lead, you will have the opportunity to tap into your curiosity and collaborate with some of the most innovative and diverse people around the world. Here, you will make an impact by:
Leading the development of pipelines for preprocessing unstructured data, eliminating duplicate data and text noise, chunking data, and generating vector embeddings.
Implementing efficient and scalable solutions using Python programming skills and cloud engineering expertise to handle unstructured data effectively.
Determining the best approaches and techniques for data preprocessing tasks, driving innovation and efficiency in handling unstructured data.
Supporting the team by providing guidance, mentorship, and technical expertise in data engineering, particularly in the context of unstructured data.
By taking on this role, you will play a crucial part in driving the success of our organization's unstructured data initiatives and contribute to the advancement of data engineering practices.
Key Responsibilities:
Lead the design and development of data pipelines for processing and analyzing unstructured data.
Preprocess unstructured data to eliminate duplicate data, text noise, and other irrelevant information.
Chunk large volumes of unstructured data into manageable segments for efficient processing.
Develop and implement algorithms and techniques for generating vector embeddings from unstructured data.
Collaborate with data scientists, machine learning engineers, and domain experts to define data requirements and objectives.
Optimize data preprocessing and embedding generation pipelines for scalability and performance.
Leverage strong Python programming skills to develop efficient and reliable data engineering solutions.
Utilize cloud engineering expertise to design and implement scalable and cost-effective data processing architectures.
Explore and leverage open source software and tools to drive innovation and efficiency in handling unstructured data.
Stay up to date with the latest advancements in data engineering and unstructured data processing techniques.
Mentor and guide junior engineers, fostering a collaborative and innovative team environment.
Your Skills and Expertise
To set you up for success in this role from day one, 3M requires (at a minimum) the following qualifications:
Bachelor's degree or higher (completed and verified prior to start) in Computer Science or Engineering.
Three (3) years of experience in unstructured data engineering at a large manufacturing company in a private, public, government or military environment.
Three (3) years of experience as a data engineer, with expertise in handling unstructured data.
Additional qualifications that could help you succeed even further in this role include:
Master’s degree in Computer Science, Engineering, or related field from an accredited institution.
Strong understanding of data engineering concepts and best practices.
Proficiency in Python programming, with the ability to develop efficient and reliable data engineering solutions.
Expertise in cloud engineering, with experience in designing and implementing scalable and cost-effective data processing architectures.
Familiarity with open source software and tools for data engineering and unstructured data processing.
Experience with data preprocessing techniques, including duplicate elimination, noise removal, and chunking.
Knowledge of algorithms and methods for generating vector embeddings from unstructured data.
Knowledge of distributed computing frameworks, such as Apache Spark or Hadoop.
Strong analytical and problem-solving skills, with the ability to optimize data processing pipelines.
Excellent communication and collaboration abilities, with the capacity to work effectively in cross-functional teams.
Ability to adapt to a fast-paced and dynamic environment.
Work location:
Hybrid Eligible (job duties allow for some remote work but require travel to Maplewood, MN at least 2 days per week)
St. Paul, MN
3M Company develops, manufactures, and markets various products worldwide. It operates through four business segments: Safety & Industrial, Transportation & Electronics, Health Care, and Consumer. The Safety & Industrial segment offers personal safety products, adhesives and tapes, abrasives, closure and masking systems, electrical markets, automotive aftermarket, and roofing granules. The Transportation & Electronics segment provides electronics products, such as display materials and systems and electronic materials solutions; automotive, aerospace, and commercial solutions; advanced materials; and transportation safety products. It serves transportation and electronic original equipment manufacturer customers.
The Health Care segment offers medical solutions, oral care, separation and purification sciences, health information systems, drug delivery systems, and food safety products. The Consumer segment provides home improvement, home care, and consumer health care products, as well as stationery and office supplies. The Consumer segment is also involved in the retail auto care business.
The company serves the automotive, electronics and automotive electrification, appliance, paper and printing, packaging, food and beverage, construction, medical clinics and hospitals, pharmaceuticals, dental and orthodontic practitioners, health information systems, food manufacturing and testing, consumer and office retail, office business-to-business, home improvement, drug and pharmacy retail, and other markets. 3M Company was founded in 1902 and is headquartered in St. Paul, Minnesota.