} -->

Unsupervised Learning Explained: A Beginner's Guide for AWS ML Associate Exam

What is unsupervised learning in AI and machine learning, and what does it mean in AWS machine learning? That's what we will see in this post. If you haven't yet read what supervised learning is, do check this link first, because comparing the two makes things easier to understand. Let's begin.


What is unsupervised machine learning?

It's just the opposite of supervised learning. In supervised learning, we saw that we give the machine labeled images (this is a car, this is an AMD PC, this is a Qualcomm chip) and it remembers them. Here the data is the same kind of data, but without the labels. Meaning, there won't be any teacher to teach it.

Let's see with an example: Imagine you walk into a room full of random items (you can even imagine your home :)): books, shoes, snacks, gadgets. No labels, no shelves, just a big mess. Now suppose someone says, “Can you try to organize this?” Even though you don’t know exactly what each item is for, you can still group them based on how they look or seem to relate to each other. You might put all the things that look edible in one pile, electronic stuff in another, and so on.

That’s what unsupervised machine learning does. The computer looks at a bunch of data without any instructions—no labels like “this is a cat” or “this is spam”—and tries to make sense of it by finding patterns. It’s like letting the computer be curious and figure things out on its own.

Instead of telling it, "Hey, these are emails that are junk," you just give it all your emails, and the system starts noticing: "Hmm, these ones have a lot of ‘free money’ and weird links... maybe they belong together."

It’s a clever way computers learn to organize, detect patterns, or discover hidden structures in messy piles of information—kind of like how humans sort socks without having to read instructions.

That's okay, but how will it do this for images? Is there any pattern it understands?

If that was your question, the short answer is yes!

  • Computers break images down into tiny squares called pixels, each with color values.

Is that like the 1080p and 4K we see in phone/YouTube videos?

Yes, you've got it right!
  • 1080p means the image is 1920 pixels wide and 1080 pixels tall.

  • 4K usually means about 3840 pixels wide and 2160 pixels tall.

So yes—when computers analyze images, they’re working with these tiny colored squares, or pixels, one by one. 

The more pixels there are, the more detailed the image is, and the more info a computer can gather from it.
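To get a feel for how many numbers that is, here is a small sketch that just multiplies width by height for the two resolutions above. The figure of 3 color values per pixel (red, green, blue) is my assumption for a typical color image, and the function name is only for illustration.

```python
# Pixel counts for the two resolutions mentioned above.
resolutions = {
    "1080p": (1920, 1080),
    "4K": (3840, 2160),
}


def pixel_count(width, height):
    """Total number of pixels in a width x height image."""
    return width * height


for name, (width, height) in resolutions.items():
    total = pixel_count(width, height)
    # Assumption: 3 color values (red, green, blue) per pixel.
    print(f"{name}: {total:,} pixels, {total * 3:,} color values")
```

So a 4K frame has four times as many pixels as a 1080p frame, which is a lot of numbers for a computer to look through.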

So, what if it wants to understand that the image is a dog?

In unsupervised learning, it won't be told 'this is a dog.' Instead, it will still break down images into pixels. It might then look for patterns: 'Hmm, these pixels often form shapes like pointy ears or round ears. These images often have similar fur textures.' If it finds many images with these similar pixel patterns, it might group them into a cluster. The computer doesn't know the word 'dog', but when a person later looks at that cluster, they can say, 'Ah, this cluster seems to be all the dogs!' The computer found the pattern on its own, without a label.

This is called Unsupervised learning!

One fact that is hard to digest is how far computers and AI have come. AI can now use learned patterns like these to create videos in seconds from the keywords or prompt you give it! I'm referring to AI-generated videos here. Think how far we have come!

Just like in supervised learning, there are several unsupervised techniques.

Let's see them one by one, in short.

What is clustering in machine learning?

Simple: when a computer groups things that look alike (even though it doesn’t know what those things are), it's called clustering. A perfect example: in your family photos, family members tend to look alike. If the system puts those faces into one group because they look alike, that is clustering.
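Here is a minimal sketch of one popular clustering method, k-means, written in plain Python so nothing needs to be installed. The points, the choice of k = 2, and the trick of using the first k points as starting centers are all my assumptions for illustration. It is not how SageMaker implements it.

```python
import math


def kmeans(points, k, iterations=10):
    """Group points into k clusters by repeatedly moving centers to cluster means."""
    # Assumption: use the first k points as the starting centers.
    centers = [points[i] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Step 1: assign every point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Step 2: move each center to the average of its cluster.
        centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
    return centers, clusters


# Made-up 2D points: three near (1, 1) and three near (8, 9).
points = [(1, 1), (8, 8), (1.5, 2), (9, 9), (1, 0.5), (8, 9.5)]
centers, clusters = kmeans(points, k=2)
for cluster in clusters:
    print(cluster)
```

Notice that we never told the code which point belongs where. It only used distances between points to form the two groups.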

Second is Anomaly Detection / Outlier Detection:

What is this Anomaly detection?

It's when a system says, "Hmm, this one looks weird compared to the rest." Like finding one red sock in a group of white ones. That's called anomaly detection! This is super useful for fraud detection or for monitoring sensor data for unusual behavior. Amazon SageMaker has a built-in algorithm called Random Cut Forest (RCF) specifically for this. Don't break your head over that for now; we will start reading about it once we have covered all the basics!
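To make the "red sock" idea concrete, here is a tiny sketch that flags values sitting far from the average. Please note this is a simple z-score check, not Random Cut Forest. The sensor readings and the threshold of 2.0 are made-up assumptions for illustration.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Return values that are more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so nothing stands out.
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical temperature readings with one odd value.
readings = [20.1, 19.8, 20.3, 20.0, 19.9, 35.0, 20.2]
print(find_anomalies(readings))  # the 35.0 reading is flagged
```

Again, nobody labeled 35.0 as "bad". It was flagged only because it looks different from the rest.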

Association Rule Learning :

It's as the name says: the computer finds associations between things. Example: for making tea, you will have tea leaves or powder plus water, so these two often show up together. Discovering interesting relationships like this between things that often appear together is called association rule learning. It is useful for things like market basket analysis, where a shop wants to know which products people tend to buy together.
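The two numbers usually used to measure such a relationship are support (how often the items appear together) and confidence (how often the second item appears when the first one does). Here is a small sketch with made-up shopping baskets. The data and function names are my own assumptions, just for illustration.

```python
# Hypothetical shopping baskets.
baskets = [
    {"tea", "water", "sugar"},
    {"tea", "water"},
    {"coffee", "water"},
    {"tea", "water", "milk"},
    {"bread", "milk"},
]


def count(itemset, baskets):
    """Number of baskets that contain every item in itemset."""
    return sum(1 for basket in baskets if set(itemset) <= basket)


def support(itemset, baskets):
    """Fraction of all baskets that contain the itemset."""
    return count(itemset, baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Of the baskets containing antecedent, the fraction that also contain consequent."""
    seen = count(antecedent, baskets)
    if seen == 0:
        return 0.0
    return count(set(antecedent) | set(consequent), baskets) / seen


print(support({"tea", "water"}, baskets))       # 3 of 5 baskets
print(confidence({"tea"}, {"water"}, baskets))  # every tea basket has water
print(confidence({"water"}, {"tea"}, baskets))  # 3 of 4 water baskets have tea
```

Here the rule "tea, so water" is stronger than "water, so tea", because water also shows up with coffee.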

There is one more important concept that I came across while going through different sites and the official documentation. A lot of people stress its importance, so we will conclude this post with it.

That concept is called dimensionality reduction. So,

What is this so-called Dimensionality Reduction in Machine learning?

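Nothing big: when the system cuts away the information it doesn't need while processing large amounts of data or pixels, that is called dimensionality reduction. It is very useful for speeding up training and can improve the performance of other machine learning models.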

Let's go back to our dog image example. We saw that it looks for patterns like ear shapes and fur textures, right? Now, suppose the computer wants to figure out, “Is there a dog here?” Going through every single dot is a slow and messy process, so it eliminates the information it doesn't need. What does it do? It starts trimming down the photo, keeping just the important parts, like the shape of the ear, the texture of the fur, or the round eyes or nose. It ignores things that aren’t helpful, like background grass or clouds, so it doesn’t waste time. This is called dimensionality reduction. It's like simplifying a drawing to just the lines that matter.

Another example: think of an Excel sheet with 100 columns about each customer. If the system needs only a few of those columns for the task and removes the unnecessary ones, that is also dimensionality reduction.
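The Excel example can be sketched in a few lines. This version simply drops columns whose values never change, since a column that is the same for every customer tells us nothing. It is a very simple form of feature selection, not a full technique like PCA (which builds new combined columns). The table and column names are made up for illustration.

```python
from statistics import pvariance

# Hypothetical customer table: column name -> one value per customer.
customers = {
    "age": [25, 40, 31, 58],
    "monthly_spend": [120.0, 80.5, 300.0, 45.0],
    "country_code": [1, 1, 1, 1],      # same for everyone
    "newsletter_flag": [0, 0, 0, 0],   # same for everyone
}


def drop_constant_columns(table, min_variance=0.0):
    """Keep only the columns whose values vary more than min_variance."""
    return {
        name: values
        for name, values in table.items()
        if pvariance(values) > min_variance
    }


reduced = drop_constant_columns(customers)
print(list(reduced))  # only the columns that actually vary remain
```

We went from 4 columns to 2 without losing anything useful, and a model trained on the smaller table has less to chew through.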

How is all this done? Through algorithms! While the math behind them is complex, the good news is that Amazon SageMaker provides many of them as built-in algorithms, making them easy to use without needing to understand every mathematical detail. For the AWS exam, it's more important to know which algorithm to use for which task.

Anyway, I'm still learning them too, just like you! I want to keep this post beginner friendly, so I will cover them in subsequent posts.

That's all for now, stay tuned!