Course Outline
Principles of distributed computing
- Apache Spark
- Hadoop
Principles of data serialization
- How a data object is passed over the network
- Serialization of objects
- Serialization approaches
- Thrift
- Protocol Buffers
- Apache Avro
- data structure
- size, speed, format characteristics
- persistent data storage
- integration with dynamic languages
- dynamic typing
- schemas
- untagged data
- change management
Data serialization and distributed computing
- Avro as a subproject of Hadoop
- Java serialization
- Hadoop serialization
- Avro serialization
Using Avro with
- Hive (AvroSerDe)
- Pig (AvroStorage)
Porting existing RPC frameworks
Requirements
General familiarity with distributed computing
Testimonials (6)
I thought he did a great job of tailoring the experience to the audience. This class is mostly designed to cover data analysis with HIVE, but me and my co-worker are doing HIVE administration with no real data analytics responsibilities.
ian reif - Franchise Tax Board
Course - Data Analysis with Hive/HiveQL
Trainer's preparation & organization, and quality of materials provided on github.
Mateusz Rek - MicroStrategy Poland Sp. z o.o.
Course - Impala for Business Intelligence
Many hands-on sessions.
Jacek Pieczątka
Course - Administrator Training for Apache Hadoop
I liked the VM very much. The teacher was very knowledgeable regarding the topic as well as other topics, and he was very nice and friendly. I liked the facility in Dubai.
Safar Alqahtani - Elm Information Security
Course - Big Data Analytics in Health
The fact that all the data and software was ready to use on an already prepared VM, provided by the trainer in external disks.
vyzVoice
Course - Hadoop for Developers and Administrators
practical things of doing, also theory was served good by Ajay