Teaching - Università Roma Tre

Algorithms for big data (objectives)

Code

20810211

Language

ITA

Type of certificate

Profit certificate

Credits

6

Scientific Disciplinary Sector Code

ING-INF/05

Contact Hours

54

Type of Activity

Core compulsory activities

Teacher	DI BATTISTA GIUSEPPE (syllabus) 1) Algorithms for data streams - Approximate counting - Majority problems - Sampling and reservoir sampling - Bloom filters - Frequent itemsets 2) Sub-linear algorithms - Diameter approximation - Property testing 3) Clustering 4) Algorithms and data structures for quantitative features analysis - 1d-,2d-,3d-range queries - Skyline (pareto frontier), near-neighbor search, voronoi diagrams 5) Dimensionality reduction 6) Algorithms for the decomposition of complex networks - Decomposition into k-connected components - Decomposition into k-cores, maximal cliques, maximal k-plexes 7) Distributed Hash Tables, Consistent Hashing 8) Integrity of large data sets, consistency in distributed systems, CAP/PACELC theorems and their impact on NoSQL databases (reference books) Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Cambridge University Press http://www.mmds.org/
Dates of beginning and end of teaching activities	From 01/03/2021 to 11/06/2021
Delivery mode	Traditional
Attendance	not mandatory
Evaluation methods	Oral exam A project evaluation

Teacher	PATRIGNANI MAURIZIO (syllabus) 1) Algorithms for data streams - Approximate counting - Majority problems - Sampling and reservoir sampling - Bloom filters - Frequent itemsets 2) Sub-linear algorithms - Diameter approximation - Property testing 3) Clustering 4) Algorithms and data structures for quantitative features analysis - 1d-,2d-,3d-range queries - Skyline (pareto frontier), near-neighbor search, voronoi diagrams 5) Dimensionality reduction 6) Algorithms for the decomposition of complex networks - Decomposition into k-connected components - Decomposition into k-cores, maximal cliques, maximal k-plexes 7) Distributed Hash Tables, Consistent Hashing 8) Integrity of large data sets, consistency in distributed systems, CAP/PACELC theorems and their impact on NoSQL databases (reference books) Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Cambridge University Press http://www.mmds.org/
Dates of beginning and end of teaching activities	From 01/03/2021 to 11/06/2021
Delivery mode	Traditional
Attendance	not mandatory
Evaluation methods	Oral exam A project evaluation

Teacher	PIZZONIA MAURIZIO (syllabus) 1) Algorithms for data streams - Approximate counting - Majority problems - Sampling and reservoir sampling - Bloom filters - Frequent itemsets 2) Sub-linear algorithms - Diameter approximation - Property testing 3) Clustering 4) Algorithms and data structures for quantitative features analysis - 1d-,2d-,3d-range queries - Skyline (pareto frontier), near-neighbor search, voronoi diagrams 5) Dimensionality reduction 6) Algorithms for the decomposition of complex networks - Decomposition into k-connected components - Decomposition into k-cores, maximal cliques, maximal k-plexes 7) Distributed Hash Tables, Consistent Hashing 8) Integrity of large data sets, consistency in distributed systems, CAP/PACELC theorems and their impact on NoSQL databases (reference books) Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Cambridge University Press http://www.mmds.org/
Dates of beginning and end of teaching activities	From 01/03/2021 to 11/06/2021
Delivery mode	Traditional
Attendance	not mandatory
Evaluation methods	Oral exam A project evaluation

Teacher	FRATI FABRIZIO (syllabus) 1) Algorithms for data streams - Approximate counting - Majority problems - Sampling and reservoir sampling - Bloom filters - Frequent itemsets 2) Sub-linear algorithms - Diameter approximation - Property testing 3) Clustering 4) Algorithms and data structures for quantitative features analysis - 1d-,2d-,3d-range queries - Skyline (pareto frontier), near-neighbor search, voronoi diagrams 5) Dimensionality reduction 6) Algorithms for the decomposition of complex networks - Decomposition into k-connected components - Decomposition into k-cores, maximal cliques, maximal k-plexes 7) Distributed Hash Tables, Consistent Hashing 8) Integrity of large data sets, consistency in distributed systems, CAP/PACELC theorems and their impact on NoSQL databases (reference books) Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Cambridge University Press http://www.mmds.org/
Dates of beginning and end of teaching activities	From 01/03/2021 to 11/06/2021
Delivery mode	Traditional
Attendance	not mandatory
Evaluation methods	Oral exam A project evaluation

Teacher	DA LOZZO GIORDANO (syllabus) 1) Algorithms for data streams - Approximate counting - Majority problems - Sampling and reservoir sampling - Bloom filters - Frequent itemsets - Number of distinct elements 2) Dimensionality reduction -Johnson–Lindenstrauss lemma Embedding metric spaces with low distortion 3) Algorithms and data structures for quantitative features analysis - orthogonal range searching (kd-trees and range trees) - nearest neighbour search, k-nearest neighbour search - fractional cascading and simplex range search 4) Algorithms for the decomposition of complex networks - Decomposition into k-connected components - Decomposition into k-cores, maximal cliques, maximal k-plexes 5) NoSQL internals: Distributed Hash Tables, chord, consistent hashing 6) Scalable security: integrity of big data sets in the cloud, consistency and scalability issues with authenticated data structures, pipelining, blockchain scalability trilemma. (reference books) Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Cambridge University Press http://www.mmds.org/
Dates of beginning and end of teaching activities	From 01/03/2021 to 11/06/2021
Delivery mode	Traditional
Attendance	not mandatory
Evaluation methods	Written test A project evaluation