Faculty of Science and Engineering

COT300XE (Computing technologies 300)
Multi-modal Information Processing

Shoji KURAKAKE

Class code etc
Faculty/Graduate school Faculty of Science and Engineering
Attached documents
Year 2024
Class code H6107
Previous Class code
Previous Class title
Term Fall semester
Day/Period Fri. 4
Class Type
Campus Koganei
Classroom name Koganei West Building, W305
Grade 3rd year
Credit(s)
Notes
Open Program
Open Program (Notes)
Global Open Program
Interdepartmental class taking system for Academic Achievers
Interdepartmental class taking system for Academic Achievers (Notes)
Class taught by instructors with practical experience
SDGs CP
Urban Design CP
Diversity CP
Learning for the Future CP
Carbon Neutral CP
Chiyoda Campus Consortium
Category Department of Applied Informatics
Department specialized courses

Outline (in English)

Multimodal information processing covers technologies for prediction and classification using data of different modalities, such as images, audio, and text. Students will learn both single-modal and multimodal data processing technologies. For image processing, convolutional neural networks (CNNs) are introduced. For speech recognition, hidden Markov models (HMMs), recurrent neural networks (RNNs), and long short-term memory (LSTM) networks are explained. Finally, applications of large language models (LLMs) to multimodal tasks are reviewed.
Students will have opportunities to use MATLAB code provided by the lecturer to deepen their understanding of the technologies covered in the course.
[Learning activities outside of classroom]
Review and preparation for each lesson will take 4 hours. Students are expected to learn how to use MATLAB on their own. Help with setup-related matters is available from the staff at the software center.
[Grading Criteria /Policy]
The grade is determined 40% by submission of the assignment for each lesson and 60% by evaluation of reports.

Default language used in class

Japanese