DataFox Technical Data Analyst (Web Crawling) full time

Oracle HQ: Makati, Metro Manila, Philippines Remote job Sep 20

Oracle is looking for a technical analyst with experience in, and a deep passion for, identifying patterns, anomalies, and hidden gems in business data.

Oracle DataFox is the real-time source of truth on over 7.5 million companies around the world. Business professionals rely on DataFox to make informed business decisions and get insights into their markets, buyers, and suppliers. DataFox uses machine learning (ML) and natural language processing (NLP) to mine the public internet, news sources, and social sources for firmographics and signals about companies. DataFox presents the opportunity to build new products based on a wealth of relevant labeled data related to companies. DataFox has an amazing team of data labelers with years of experience making decisions about corporate hierarchies, business identities, and signal identification based on internet research. Every day, tens of thousands of events in the business world are processed, analyzed, and incorporated into DataFox’s growing knowledge base of companies.

Data quality is essential to DataFox’s products. As a technical data analyst, you will work on projects that will significantly impact the company. Example projects include:

·      automate real-time company updates through large-scale web crawling

·      integrate new data sources to increase DataFox company coverage and fill-rate

·      explore and identify potential new sources for company firmographics

·      identify innovative ways to improve data precision and validate data quality

Main responsibilities:

·      Explore and evaluate new data sources

·      Participate in large-scale web crawling projects

·      Implement custom code, searches, and reports

·      Deliver proof-of-concepts

·      Provide current and future state flows, testing approach, and impact analysis

·      Test and validate code against requirements

·      Collaborate with team members and process owners to deliver great products and tools

·      Ensure all solutions achieve the defined business objectives and success metrics

·      Provide analytical support for implemented projects


Minimum Qualifications:

·      3 or more years of product development or technical services experience

·      Familiarity with general-purpose programming languages, particularly Python and its environments, design patterns, and libraries such as Pandas, NumPy, Matplotlib, Seaborn, Beautiful Soup, etc.

·      Familiarity with JavaScript/Node.js

·      Familiarity with Jupyter Notebook, XPath, and other web scraping technologies

·      Experience with SQL and NoSQL databases

·      General understanding of APIs

·      Adeptness in data collection, querying data through API calls, and querying internal databases

·      Understanding of data exploration and data preprocessing, including data cleaning, data normalization, feature extraction, instance selection, etc.

·      Ability to tell a compelling story with data using data visualization techniques

·      Ability to make recommendations and decisions independently and make convincing arguments for the direction of the products

·      Ability to prioritize tasks, identify and mitigate risks, and manage time effectively

·      Strong communication skills

·      Computer Science or similar degree

·      Experience building web-based applications is preferred

This is a great opportunity within Oracle and within a really talented, global development community.

Full-time (40 hrs/wk)
Experience levels:
Intermediate (3 - 5 yrs)
Negotiable rate