GCP Data Engineer

Company: PamTen, Inc.
Location: Dearborn
Posted on: June 21, 2022

Job Description:

Primary skill: BigQuery, SQL, Dataflow/Apache Beam - core Java
Primary skill: Worked on both batch and streaming pipelines with Pub/Sub, Kafka, and APIs
Secondary skill: Apache Airflow (Composer) - Python
Secondary skill: Experience with ETL/ELT/CDC tools like DataStage, Informatica, Talend, etc. preferred
Preferred certification: Google Professional Data Engineer

Design and build production data engineering solutions to deliver data pipeline patterns using the following Google Cloud Platform (GCP) services:

+ In-depth understanding of Google's product technology and underlying architectures.
+ BigQuery - Warehouse/data marts - Thorough understanding of BigQuery internals to write efficient queries for ELT needs, create views/materialized views, create reusable stored procedures, etc.
+ Dataflow (Apache Beam) - Reusable Flex Templates/data processing frameworks using Java for both batch and streaming needs.
+ Pub/Sub, Kafka, Confluent Kafka - Real-time streaming of database changes or events.
+ Experience designing, building, and deploying production-level data pipelines using Kafka; strong experience working with event-driven architecture.
+ Strong knowledge of the Kafka Connect framework, with experience using several connector types: HTTP REST proxy, JMS, File, SFTP, JDBC, etc.
+ Experience handling huge volumes of streaming messages from Kafka.
+ Cloud Composer (Apache Airflow) - To build, monitor, and orchestrate pipelines.
+ Knowledge of Bigtable.
+ Cloud SQL, Compute Engine, Cloud Functions, Cloud Run, App Engine, and Cloud Storage.
+ Experience with open-source distributed storage and processing utilities in the Apache Hadoop family.
+ Extensive knowledge of processing various file formats: ORC, Avro, CSV, JSON, XML, etc.
+ Knowledge of/experience with any ETL tools like DataStage/Informatica - Ability to understand existing on-premises ETL workflows and redesign them in GCP.
+ Experience and expertise with Terraform to deploy GCP resources in CI/CD.
+ Knowledge of/experience with connecting to on-premises APIs from Google Cloud.

