Website www.home.kpmg.com
Eligibility Any Graduate
Experience Freshers
Location Bangalore
Job Role Big Data Engineer
JOB SUMMARY:
Company Profile:
We encourage our people to make a sustainable difference for clients, communities and firms, while simultaneously ensuring that everyone has access to the knowledge and skills they need to be the most relevant in the market. KPMG International’s People, Performance and Culture Manual aligns member firms to the four labor principles of the UNGC and informs KPMG firms’ local labor practices.
Job Description:
Role Summary:
1. Support client engagements focused on Big Data and Advanced Business Analytics in diverse domains such as Retail, Finance and Healthcare.
2. Build and maintain infrastructure to support data collection, ETL and analysis
3. Interface with databases (SQL, NoSQL, HDFS) to extract, transform and load data
4. Implement algorithms and software needed to perform analyses
5. Process data in large-scale environments such as Amazon EC2, Storm, Hadoop and Spark
6. Analyze data using R, Python, Java, and open source packages
Candidate Profile:
Mandatory Skills:
1. Bachelor's or higher degree in computer science or a related field
2. Proficiency in one or more modern programming languages such as Java or Python, and in analytics packages such as R, SAS or MATLAB
3. Experience working in a Unix/Linux environment and proficiency in command-line scripting
4. Ability to implement, maintain, and troubleshoot big data infrastructure, such as distributed processing paradigms, stream processing (Storm, Spark), search APIs (Solr) and databases such as Hadoop, HBase, Hive and SQL
5. Strong mathematical background, with the ability to understand algorithms and methods from both a mathematical and an intuitive viewpoint
6. Ability to break down complex problems and develop strategies that prioritize key areas
7. Experience processing large datasets in a cloud environment
8. Experience working with large datasets and problems
9. Experience in machine learning, natural language processing and/or information retrieval
10. Knowledge of search engines, spam detection, recommendation systems, and/or social networks
11. Strong data extraction and processing skills; experience with MapReduce, Pig, and/or Hive preferred
Good-to-have Skills:
1. Analyze and model structured data using advanced statistical methods.