JOB PURPOSE STATEMENT
The Data Engineering team is responsible for documenting data models, architecting distributed systems, creating reliable data pipelines, combining data sources, architecting data stores, and collaborating with the data science teams to build the right solutions for them.
The team does this using open-source big data platforms such as Apache NiFi, Kafka, Hadoop, Apache Spark, Apache Hive, HBase and Druid, together with the Java programming language, picking the right tool for each purpose.
The growth of every product relies heavily on data, for example for scoring and for studying product behavior to guide improvements. It is the role of the data engineer to build fast, horizontally scalable architectures using modern tools rather than traditional Business Intelligence systems.
KEY ACCOUNTABILITIES (DUTIES AND RESPONSIBILITIES)
Documenting Data Models (10%)
The role is responsible for documenting the entire end-to-end journey that data elements take, from the data sources to all the data stores, including every transformation in between, and for keeping those documents up to date with every change.
Architecting Distributed Systems (10%)
Modern data engineering platforms are distributed systems. The data engineer designs the right architecture for each solution, using best-of-breed open-source tools from the big data ecosystem, because no single solution does everything; the tools are specialized, lean and fit for purpose. The architecture should be able to process any data, at any time, anywhere, for any workload.
Combining Data Sources (10%)
Pulling data from different sources, which may be structured, semi-structured or unstructured, using tools such as Apache NiFi, and taking the data through a journey that produces a final state useful to the data consumers.
These sources can include REST, JDBC, Twitter, JMS, images, PDF and MS Word documents. The data is placed in staging environments such as Kafka topics for onward processing, ensuring that expectations are met.
Developing Data Pipelines (15%)
Creating data pipelines that will transform data using tools such as Apache Spark and the Java programming language.
The pipelines may apply processing such as machine learning, aggregation, iterative computation, and so on.
Architecting Data Stores (15%)
Designing and creating data stores using big data platforms such as Hadoop and NoSQL databases such as HBase.
Data Query and Analysis (35%)
Utilizing tools such as Apache Hive to analyze data in the data stores to generate business insights.
Team Leadership (5%)
Providing team leadership to the data engineers.
JOB SPECIFICATIONS
Academic:
A Bachelor’s Degree in Computer Science, Information Technology or related field.
Professional Qualifications:
Certification in, and experience implementing, best-practice frameworks (e.g. ITIL, PRINCE2) is preferred.
Desired work experience:
Minimum 5 years’ experience developing object-oriented applications using the Java programming language.
Minimum 5 years' experience working with relational databases.
Minimum 5 years' experience working with the Linux operating system.
JOB COMPETENCIES
Technical Competencies
Ability to architect distributed systems, create data pipelines, combine data sources, architect data stores, and collaborate with the data science teams and business users to create the right data solutions for them.
Experience with open-source big data platforms and tools (Hadoop, Kafka, Apache NiFi, Apache Spark, Apache Hive, NoSQL databases) and ODI.
Experience working with Data Warehouses.
Experience with DevOps, Agile working and CI/CD.
Familiarity with complex systems integrations using SOA tools (Oracle WebLogic/ESB/SOA).
Familiarity with industry standard formats and protocols (JMS, SOAP, XML/XPath/XQuery, REST and JSON) and data sources.
Knowledge of organizational structure and design, organizational strategy, how data flows in the organisation, the meaning of data, the information management domain, software acquisition, outsourcing management (RFP, RFI and RFQ), and compliance (such as the Central Bank of Kenya Act).
Excellent analytical, problem-solving and reporting skills.
A good knowledge of the systems and processes within the financial services industry.
Behavioural Competencies
The ideal candidate is passionate about innovation.
Loves technology and possesses both a deep and broad understanding of the technology market and cutting-edge technology trends.
Continuously listens to our stakeholders’ feedback, coming up with new architectures and enhancing existing ones to leverage these cutting-edge technologies.
Excellent planning and organizational skills, with the ability to break down complex items into actionable elements.
Decisive and solution-focused. Possesses strong analytical skills, with the ability to collect, organize and analyze significant amounts of information with attention to detail and accuracy.
Relates easily and naturally with executives, business managers, technical teams and customers. Has excellent listening skills and understands the desires and challenges of all our leaders and customers.
Able to change plans, methods, opinions or goals in light of new information, with the readiness to act on opportunities. Highly effective in adapting to differing environments.
Capable of developing a sound understanding of the motives, needs and concerns of others, and a deep understanding of their complex stakeholder network. Can anticipate the motives and expectations of others effectively.
Self-motivated and self-managing.