Optional group:
Common, single track, 2nd year - choose four among:
|
24
|
|
|
|
|
|
|
|
20802125 -
BIG DATA
(objectives)
The goal of the course is to illustrate modern solutions for managing big data, that is, very large repositories of unstructured data. Starting from the requirements of modern database applications, the course illustrates the hardware and software architectures recently proposed for the management and analysis of big data. Topics include cluster architectures, the MapReduce paradigm, cloud computing, NoSQL systems, and tools and languages for data analysis. Both theoretical and practical aspects are addressed; the technologies discussed are tried out during practical classes and through project assignments.
-
TORLONE RICCARDO
( syllabus)
- Infrastructures and programming paradigms for big data - The Hadoop Ecosystem - Cloud computing - Big data processing (MapReduce, Hive, Spark) - NoSQL systems - Big data analytics - Data lakes - Systems and applications - Business seminars
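As an illustration of the MapReduce paradigm listed in the syllabus, the following is a minimal single-process sketch in Python. It is not course material and does not use Hadoop or Spark; the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and only mimic the three stages a real framework would distribute across a cluster.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Emit one (word, 1) pair per word of a single input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group values by key, as a framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values emitted for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence on an in-memory collection."""
    mapped = chain.from_iterable(map_phase(d) for d in documents)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big clusters"]))
```

In a real deployment the map and reduce functions would run in parallel on different nodes, and the shuffle step would move data over the network.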
( reference books)
Martin J. Fowler, Pramodkumar J. Sadalage. "NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence". Teacher slides (available on the course website).
|
6
|
ING-INF/05
|
54
|
-
|
-
|
-
|
Core compulsory activities
|
ITA |
20810006 -
ADVANCED TOPICS IN COMPUTER SCIENCE
(objectives)
The goal of the course is to present models, methods and systems related to the latest advances in information technology that meet the requirements of modern applications. The course is taught in English by highly qualified foreign professors.
-
TORLONE RICCARDO
( syllabus)
The program is defined at the beginning of the course on the basis of the invited foreign professors. The website of the course is always kept up to date with the latest information.
( reference books)
Teaching materials and texts are chosen by foreign professors.
|
6
|
ING-INF/05
|
42
|
-
|
-
|
-
|
Core compulsory activities
|
ITA |
20810087 -
MACHINE LEARNING
(objectives)
Enable students to gain an in-depth understanding of the main Machine Learning models and methods, such as regression, classification, clustering and deep learning, and to use them as tools for the development of innovative technologies.
-
Derived from
20810266 Machine Learning in Ingegneria informatica LM-32 MICARELLI ALESSANDRO, GASPARETTI FABIO
( syllabus)
1. Regression: Review of Linear Regression. Assessment and Overfitting in Regression. Feature Selection and Lasso.
2. Classification: Review of Logistic Regression for classification. Overfitting in Classification. Boosting. AdaBoost algorithm. Support Vector Machine (Large Margin Classification, Kernel I, Kernel II). Naïve Bayes.
3. Clustering and Retrieval: K-NN algorithm. K-Means algorithm. Expectation Maximization. Applications to Information Retrieval.
4. Dimensionality Reduction: Data compression and visualization. Principal Component Analysis (PCA). Choice of the number of principal components. Applications to Recommender Systems.
5. Deep Learning: Deep Feedforward Networks. Regularization for Deep Learning. Convolutional Networks. Various applications.
6. Case studies and projects: Several case studies will be presented and projects will be proposed to apply the notions learned to various domains of interest.
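To give a flavour of the clustering part of the syllabus, here is a minimal sketch of the K-Means algorithm (Lloyd's iteration) in plain Python. It is an illustration under simplifying assumptions, not the course's reference implementation: points are 2-D tuples, the initial centroids are supplied by the caller, and the function name kmeans is hypothetical.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm on 2-D points with caller-supplied initial centroids.

    Returns the final centroids and the clusters from the last assignment step.
    """
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: (p[0] - centroids[i][0]) ** 2
                + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
            if c
            else old
            for c, old in zip(clusters, centroids)
        ]
    return centroids, clusters


if __name__ == "__main__":
    pts = [(0, 0), (0, 2), (10, 10), (10, 12)]
    print(kmeans(pts, [(0, 0), (10, 10)]))
```

A fixed number of iterations is used for simplicity; practical implementations usually stop when the assignments no longer change.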
( reference books)
Lecture notes by the professor.
|
6
|
ING-INF/05
|
54
|
-
|
-
|
-
|
Core compulsory activities
|
ITA |
20802126 -
INFORMATION VISUALIZATION
(objectives)
The large amount of relational data available electronically makes its exploration through visual interfaces an interesting research domain and a promising area for the development of new software products. New visual tools appear each day for exploring social networks, databases, computer networks, semantic web networks, security data, etc. The recent widespread adoption of tablets and smartphones further increases the need for innovative visual interfaces that are both intuitive and effective. The goal of this course is to introduce participants to the problems and solutions in the field of visual exploration of abstract data, with particular emphasis on the graphic metaphors adopted and on the algorithmic methods and models used. Participants deepen their knowledge of algorithm engineering and network optimization problems, and apply it to different kinds of visualization problems with a strongly practical approach.
-
PATRIGNANI MAURIZIO
( syllabus)
Data and Visualization: Data overloading. Comparison of Scientific Visualization and Information Visualization. Structured and Unstructured data. Data transformation. Data tables.
Visual Perception: Our vision’s principles and limitations. Peripheral and central view. The perception of color.
Cognitive Issues and User Tasks: Perception abilities. Weber's law. Stevens' power law. Gestalt laws. A two-stage model for visual perception. Task taxonomies.
Infovis on the Web - SVG and D3.js: Basic ingredients of Web data visualization. JavaScript crash course. Raster and vector graphics. Overview of JavaScript libraries. Focus on D3.js.
Multivariate Data Representations: Combined views. Icons or glyphs. Alternative coordinate systems.
Visualization in Computer Networks: Visual analysis in the computer network domain. Motivations. Taxonomies. Real-world examples and use cases. Open questions.
Design Methods and Evaluation: Design methodologies and design choices. Design evaluation (goals, difficulties, practices, guidelines).
Visualization of Time Series Data: Definition of time series and temporal data. Visualization of time series (single dependent variable, multiple dependent variables). Case studies.
Interaction: Classification of interaction mechanisms, goals, and timings. Examples of interaction strategies.
Introduction to Graph Drawing: Graph Drawing conventions and aesthetics. The divide and conquer approach for testing planarity of a graph.
Node-link Representations of Trees: Representing trees within the node-link paradigm. Layered drawings of trees. Hv-drawings of trees. Limitations of node-link representations.
Space-Filling Visualizations of Trees: Algorithms and systems for the representation of trees using the space-filling strategy. Treemaps. 3D Space-filling approaches.
Representations of Graphs and Networks with the Force-Directed Approach: The force-directed paradigm. The barycenter method. Spring embedders. Scalability and flexibility of the force-directed paradigm. Fruchterman-Reingold and Barnes–Hut algorithms. Simulating graph theoretic distances. Magnetic fields. Generic energy functions. Handling drawing constraints.
Representations of Hierarchical Data: Algorithms for the representation of layered networks. The Sugiyama approach. Step 1: Cycle removal. Step 2: Level Assignment. Step 3: Crossing Reduction. Step 4: X-Coordinate Assignment
Orthogonal Drawings: Computing orthogonal drawings via Network Flows. The Topology-Shape-Metric approach. Extension to graphs of arbitrary degree. Representations of orthogonal drawings obtained from visibility representations and by incremental approaches.
Visualizing Large Graphs: Strategies for the visualization of massive amounts of data providing both overview and details. Alternate between views. Combine different views. Filtering and clustering principles. Three-dimensional and two-dimensional representations of clustered graphs. Hybrid representations.
Tools and Libraries for Drawing Graphs: Tools and libraries for drawing graphs. Programming languages, input and output formats, and interaction. Some practical examples.
Architectures for Scalable Information Visualization: Computational and memory scalability. Visualization architectures. Strategies for visualizing massive amounts of data.
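As an illustration of the space-filling strategy mentioned in the syllabus, the following is a minimal sketch of a slice-and-dice treemap layout in Python. It is not taken from the course slides; the tree encoding (dictionaries with "name", "size" and "children" keys) and the function names size and treemap are assumptions made for this example.

```python
def size(node):
    """Size of a leaf is its own 'size'; an internal node sums its children."""
    children = node.get("children")
    return sum(size(c) for c in children) if children else node["size"]


def treemap(node, x, y, w, h, vertical=True, out=None):
    """Slice-and-dice layout: returns {leaf name: (x, y, width, height)}.

    The rectangle (x, y, w, h) is split among the children in proportion to
    their sizes, alternating the split direction at each level of the tree.
    """
    if out is None:
        out = {}
    children = node.get("children")
    if not children:
        out[node["name"]] = (x, y, w, h)
        return out
    total = size(node)
    offset = 0.0  # fraction of the rectangle already assigned
    for child in children:
        share = size(child) / total
        if vertical:  # split the width, keep the height
            treemap(child, x + offset * w, y, share * w, h, False, out)
        else:  # split the height, keep the width
            treemap(child, x, y + offset * h, w, share * h, True, out)
        offset += share
    return out


if __name__ == "__main__":
    tree = {
        "name": "root",
        "children": [
            {"name": "a", "size": 2},
            {"name": "b", "children": [
                {"name": "c", "size": 1},
                {"name": "d", "size": 1},
            ]},
        ],
    }
    print(treemap(tree, 0, 0, 4, 2))
```

Slice-and-dice is the simplest treemap variant; it can produce long thin rectangles, which other layout strategies try to avoid.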
( reference books)
Slides provided by the teacher and downloadable, as the course progresses, from the course website (http://www.dia.uniroma3.it/~infovis/). A user ID and password are required to download the slides (ask the teacher at maurizio.patrignani@uniroma3.it).
|
6
|
ING-INF/05
|
54
|
-
|
-
|
-
|
Core compulsory activities
|
ITA |
20810205 -
Digital entrepreneurship
(objectives)
Provide students with the technical and methodological skills necessary to conceive, develop and implement a digital business project. The course is divided into three parts. The first part explains the reasons behind the success of digital companies (especially, but not only, startups) and the dynamics of digital innovation. The second part offers students the technical and methodological tools for carrying out a digital business project. The third part consists of carrying out a project and is characterized by a strongly experimental approach.
-
MERIALDO PAOLO
( syllabus)
Part 1 (1 CFU): The success of digital enterprises • From the invention of the microprocessor to cloud computing • Digital enterprise business models • Life cycle of a digital enterprise
Part 2 (2 CFU): Design, build, improve a digital product • Idea, team, funding • Lean Canvas • User-centered design (UCD) and minimum viable product (MVP) • Investors
Part 3 (3 CFU): Teamwork. Students can choose to join the dock3 training program or to develop their own idea by themselves.
( reference books)
Business Model Generation: A Handbook for Visionaries, Game Changers, and Challengers by Alexander Osterwalder, Yves Pigneur
The Four Steps to the Epiphany: Successful Strategies for Products that Win by Steve Blank
The Lean Startup: How Today's Entrepreneurs Use Continuous Innovation to Create Radically Successful Businesses by Eric Ries
|
6
|
ING-INF/05
|
54
|
-
|
-
|
-
|
Core compulsory activities
|
ITA |
20810211 -
Algorithms for big data
(objectives)
In many application contexts huge volumes of data are produced which are used in the economic-financial, political, social and even institutional fields. Often the data is stored in huge distributed clouds and is sometimes generated according to a continuous flow, so large as to make complete storage unfeasible. In many cases the data pertains to entities in close relationship with each other and gives rise to massive networks of connections. Familiar examples for such networks are biological and social networks, distribution networks, and the Web graph. Furthermore, the fact that the data is stored in systems managed by third parties poses integrity problems, which have not been considered in the classical IT literature in terms of both their type and scale.
This scenario poses unprecedented algorithmic challenges, which are being studied by a large community of researchers. In the last decade, this effort has produced many innovations at both the methodological and technological levels. This course aims to transfer to students some of the most important methodological tools that have originated from research on Big Data algorithms. These tools are presented within challenging application contexts.
-
DI BATTISTA GIUSEPPE
( syllabus)
1) Algorithms for data streams - Approximate counting - Majority problems - Sampling and reservoir sampling - Bloom filters - Frequent itemsets - Number of distinct elements
2) Dimensionality reduction - Johnson–Lindenstrauss lemma - Embedding metric spaces with low distortion
3) Algorithms and data structures for quantitative features analysis - Orthogonal range searching (kd-trees and range trees) - Nearest neighbour search, k-nearest neighbour search - Fractional cascading and simplex range search
4) Algorithms for the decomposition of complex networks - Decomposition into k-connected components - Decomposition into k-cores, maximal cliques, maximal k-plexes
5) NoSQL internals: Distributed Hash Tables, Chord, consistent hashing
6) Scalable security: integrity of big data sets in the cloud, consistency and scalability issues with authenticated data structures, pipelining, blockchain scalability trilemma.
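Two of the data-stream topics above, reservoir sampling and the majority problem, can be sketched in a few lines of Python. This is a minimal illustration and not course material: it uses the classic "Algorithm R" for sampling and the Boyer-Moore voting scheme for the majority candidate, and the function names reservoir_sample and majority_candidate are hypothetical.

```python
import random


def reservoir_sample(stream, k, rng=random):
    """Keep a uniform random sample of k items from a stream of unknown length.

    Uses O(k) memory: the i-th item (0-based) replaces a random slot of the
    reservoir with probability k / (i + 1).
    """
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


def majority_candidate(stream):
    """Boyer-Moore voting: one pass, constant memory.

    If some element occurs in more than half of the stream, it is returned.
    Otherwise the result is arbitrary, and a second pass is needed to verify it.
    """
    candidate, count = None, 0
    for item in stream:
        if count == 0:
            candidate, count = item, 1
        elif item == candidate:
            count += 1
        else:
            count -= 1
    return candidate


if __name__ == "__main__":
    print(reservoir_sample(range(1000), 5, random.Random(0)))
    print(majority_candidate("aabab"))
```

Both functions read the input once and never store the whole stream, which is the constraint that motivates this family of algorithms.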
( reference books)
Jure Leskovec, Anand Rajaraman, Jeff Ullman. "Mining of Massive Datasets". Cambridge University Press. http://www.mmds.org/
-
PATRIGNANI MAURIZIO
-
FRATI FABRIZIO
-
PIZZONIA MAURIZIO
-
DA LOZZO GIORDANO
|
6
|
ING-INF/05
|
54
|
-
|
-
|
-
|
Core compulsory activities
|
ITA |
|