Large Language Models, Artificial Intelligence and Data Science
Syllabus: CS211 Section 301 Fall24
Info
3 Credits
Tue 2:10pm - 5:00pm
Room C305
Instructor Information
calvin_williamson@fitnyc.edu
office: B831 Science and Math
office hours: T 5-6, W 11-12, R 10-12
Description
This course provides an introduction to large language models and their capabilities in artificial intelligence and data science. Through theory and hands-on labs, students will gain literacy in model architecture, training techniques, applications in programming, content creation, and more. No prior programming experience is required to take this course. Prerequisite(s): Math Proficiency
Outcomes
- Explain the evolution, capabilities, and limitations of large language models.
- Analyze the core components of LLMs including model architecture, parameters, and training techniques.
- Assess strategies like prompt engineering and fine-tuning to optimize LLM performance.
- Explore the mathematical foundations of vector databases and storage, including semantic similarity.
- Apply LLMs to natural language tasks like classification, summarization, and question answering.
- Utilize LLMs to assist with programming, data science, content creation, and other domains.
- Examine ethical implications of LLMs including bias, misinformation, and legal issues.
Course Materials
We will be using Google Colab, OpenAI ChatGPT, Anthropic Claude, Google Gemini, and other LLM tools for all work in this course. Since these are web-based applications, there is NO OTHER SOFTWARE required for the course besides a web browser.
Topics
Introduction to Python for Artificial Intelligence
- Google Colab Notebook
- Using LLM as Coding Assistant
- Calculations
- Variables
- Data Types
- Lists
- Dictionaries
- Functions
- Dataframes
- f-Strings
Fundamentals of LLMs, Generative AI
- Basic architecture of LLMs
- Foundational models, Number of Parameters
- Context size
- Importance of Pretraining vs Fine-tuning
- Transformers, Attention Mechanisms, Encoder-Decoder architectures
Introduction to Large Language Models (LLMs)
- LLM Examples (GPT, Claude, Gemini)
- Completions, APIs
- Prompting
- Prompt Chaining
- Roles and Personas
- Chain of thought
- Few-shot and zero-shot Learning
LLM Applications
- Text classification
- Translation
- Sentiment analysis
- Question Answering
- Text summarization
- Named entity recognition (NER)
LLM and Data (Retrieval-Augmented Generation)
- Question answering over data (PDFs, websites)
- Searching and semantic similarity
- Text embeddings and vector databases
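As a preview of the semantic similarity topic, here is a minimal Python sketch of cosine similarity, one common way to compare two text embeddings. The three-number vectors are made-up toy values, not the output of any real embedding model, and the function name is only illustrative.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy "embeddings" (assumed values, for illustration only).
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

print(f"cat vs kitten:  {cosine_similarity(cat, kitten):.2f}")
print(f"cat vs invoice: {cosine_similarity(cat, invoice):.2f}")
```

A score near 1 means the two vectors point in nearly the same direction (similar meaning), while a score near 0 means they are unrelated. Searching a vector database typically amounts to finding the stored vectors with the highest such score.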
LLM Application Building Tools
- LangFlow
Evaluation
Your grade will come from these parts:
- Quizzes (70%)
- Problem Credits (24%)
- DataCamp Courses (6%)
Each of these parts is described in more detail below.
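As a rough illustration of how the three weights combine, here is a minimal Python sketch. It assumes each part is first converted to a percentage from 0 to 100; the function name and the sample scores are hypothetical and are not an official grading formula.

```python
# Weights taken from the Evaluation list; they sum to 100%.
WEIGHTS = {"quizzes": 0.70, "problem_credits": 0.24, "datacamp": 0.06}


def course_grade(quizzes, problem_credits, datacamp):
    """Weighted average of three parts, each given as a percentage (0-100)."""
    return (
        WEIGHTS["quizzes"] * quizzes
        + WEIGHTS["problem_credits"] * problem_credits
        + WEIGHTS["datacamp"] * datacamp
    )


# Hypothetical student: 80% quiz average, all problem credits, all DataCamp courses.
print(f"Course grade: {course_grade(80, 100, 100):.1f}")  # prints "Course grade: 86.0"
```

Because quizzes carry 70% of the weight, each point of quiz average moves the course grade by 0.7 points under this assumption.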
Quizzes (70%)
Your quiz grade will come from 5 quizzes, each covering roughly 2 or 3 weeks of material. Each quiz takes 30-45 minutes and usually has 5 or 6 questions. Quizzes are taken with no notes, no internet, no phone, no software, and no AI tools: pen, paper, and calculator only. Questions are a mix of multiple choice, short answer, and true/false.
Problem Credits (2 or 3 per class) (24%)
Problem credits are credits you obtain for demonstrating that you have completed assigned problems. Some will come from homework assignments that you show me at the beginning of class; others will come from in-class assignments that you show me as you complete them. You will earn 1 credit for each successful problem completion. You must be in attendance to earn problem credits.
DataCamp Courses (6%)
This course will use some material from datacamp.com, a website for learning data science. You will have free access to the site for 6 months beginning on the first day of our course. You will complete 2 or 3 required DataCamp courses on topics related to our course. You can also take any other DataCamp course for free, whether or not it relates to our course. There are many courses at all levels, from beginner to advanced, on data science, programming, AI, etc.
There is NO FINAL EXAM.
AI Policy
All uses of chatbots (ChatGPT, Gemini, Claude, etc.) are encouraged, and there is no restriction on their use. This applies especially to topics about large language models.