Job Category: Data Scientist
Job Type: Full Time
Job Location: Hybrid Noida - India

Minimum Experience: 7+ years

Mandatory Skill Set: Time Series, NLP, Computer Vision, LLM, Development and Deployment of ML models, Python

About Us

CLOUDSUFI is a Data Science and Product Engineering organization building Products and Solutions for Technology and Enterprise industries. We firmly believe in the power of data to transform businesses and make better decisions. We combine unmatched experience in business processes with cutting-edge infrastructure and cloud services. We partner with our customers to monetize their data and make enterprise data dance.

Job Description

We are seeking an experienced and innovative Senior Data Scientist to join our team. The ideal candidate will be responsible for leveraging advanced statistical and machine learning techniques to extract insights from complex datasets and develop predictive models to support business objectives.

Job Responsibilities

  • Partner with stakeholders to identify business challenges and translate them into actionable data science projects with a focus on AI and Generative AI applications. 
  • Design and conduct advanced data analysis, leveraging AI techniques like deep learning and natural language processing, to extract meaningful insights from complex datasets. 
  • Develop, train, and evaluate cutting-edge machine learning models, including Generative Adversarial Networks (GANs) and transformers, for tasks like image/text generation, data augmentation, and creative content development. 
  • Champion responsible AI practices, ensuring models are fair and unbiased and that potential risks are mitigated.
  • Collaborate with cross-functional teams (engineering, product, marketing) to integrate AI and Generative AI solutions seamlessly into our products and workflows. 
  • Lead and mentor junior data scientists, fostering a culture of continuous learning and innovation. 
  • Working experience with LLMs, including prompt engineering, fine-tuning, and explainability.
  • Working knowledge of advanced Retrieval-Augmented Generation (RAG) and the ability to apply it to structured and unstructured data.
  • Communicate complex technical concepts clearly and concisely to both technical and non-technical audiences.
  • Formulate and lead guided, multifaceted analytic studies against large volumes of data.
  • Interpret and analyze data using exploratory mathematical and statistical techniques based on the scientific method.
  • Coordinate research and analytic activities utilizing various data points (unstructured and structured) and employ programming to clean, massage, and organize the data.
  • Lead all data experiments assigned by the Data Science Team. Coordinate with Data Engineers to build data environments that provide the data identified by Data Analysts, Data Integrators, Knowledge Managers, and Intel Analysts.
  • Working knowledge of MLOps and of establishing ML pipelines.
  • Work closely with all business units and engineering teams to develop strategy for long-term data platform architecture.
  • Working experience with production-grade data science problems and handling large amounts of data.
  • Evaluate ML and LLM performance using appropriate metrics and benchmarks.
  • Ensure compliance with data privacy regulations and ethical AI practices, particularly in handling textual data.
  • Implement security measures to protect sensitive data.
  • Lead the data-driven decision-making process, from data collection and analysis to implementation and monitoring of solutions
  • Collaborate with cross-functional teams to understand business challenges and objectives, translating complex data into actionable insights.
  • Coordinate with different functional teams to implement models and monitor outcomes.
  • Experience in PyMuPDF, Camelot, spaCy, NLTK, TextBlob, PyTorch, email parsers, PDF parsers, scikit-learn, XGBoost, LightGBM, Keras
  • Experience in PostgreSQL, MySQL
  • Experience in Linux, shell programming and MLflow
  • Experience in data quality tools like Great Expectations (GX)
  • Experience in Docker and Git
  • Experience in OCR techniques including Tesseract, Pytesseract
  • Experience in Reinforcement Learning

Required Experience

  • Bachelor's or Master's degree in Data Science
  • Minimum 6-7 years of experience with hands-on data science, NLP, computer vision, and/or machine learning projects/products in industry.
  • Deep knowledge of model development 
  • Demonstrated experience applying data science methods to real-world data problems
  • Decent understanding of downstream business and supply chain. Must have excellent communication and interpersonal skills and work effectively in cross-functional teams.
  • Experience conducting big data analysis, conditioning data, developing algorithms, and executing predictive analytics.
  • Able to bring ideas from conceptualization to production (putting models in production) using the right tools (e.g., MLflow, Kubeflow, tensorlight, etc.).
  • Experienced in Information Retrieval (content recommendation, search metrics, search queries, document classification, entity recognition, topic modeling, etc.).
  • Proficiency in using LLMs
  • Proficient with one or more programming languages (Java, C++, Python, R, etc.)
  • Experience utilizing visualization tools to take advantage of the growing volume of available information
  • Very strong expertise in data collection, cleaning, preprocessing, and wrangling is required.
  • Proficiency in visualization tools and packages, and in communicating data science topics to non-technical audiences, is required.
  • Experience in Oil & Gas, Finance and Logistics is a plus

Non-Technical/ Behavioral Competencies

● Must have worked with Middle East-based clients in onsite/offshore delivery models.
● Should have very good verbal and written communication, technical articulation, listening and presentation skills.
● Should have superior persuasive and negotiation skills.
● Should have demonstrated effective task prioritization, time management and internal/external stakeholder management skills.
● Should be a quick learner, self-starter, go-getter and team player.
● Should have experience of working under stringent deadlines in a Matrix organization structure.
● Should have demonstrated appreciable Organizational Citizenship Behavior (OCB) in past organizations.
