BDA IA2 Resources
Curated IA2 notes organized in a clean sequence for quick revision.
How to use this page: Follow the resources in order. Each page is cleaned and consolidated for exam-focused study.
Flajolet-Martin Algorithm
Approximate distinct-count estimation in streams: hash each element and track the maximum number of trailing zeros seen in the hash values.
Distributed Systems, PageRank, and CPM
Cassandra, ZooKeeper, HBase, Kafka, PageRank iterations, and clique-based community detection.
Bloom Filter and Hashing
Membership testing, false positives, and practical hashing/modulo concepts used in big-data pipelines.
Distance and Similarity Measures
Euclidean, Manhattan, Cosine, and Jaccard measures with worked examples and interpretation guidance.
PCY Algorithm
Frequent-pair mining with hash buckets in the first pass, bitmap pruning between passes, and final candidate verification in the second pass.
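For quick revision, the three PCY stages named above (hash buckets, bitmap pruning, candidate verification) can be sketched in a few lines. This is a minimal illustration, not the code from the linked notes: the function name `pcy_frequent_pairs`, the bucket count, and the pair hash `(a * 31 + b) % num_buckets` are all assumptions chosen for the example, and items are assumed to be integers.

```python
from collections import Counter
from itertools import combinations


def pcy_frequent_pairs(baskets, min_support, num_buckets=11):
    """Return {pair: count} for pairs appearing in >= min_support baskets."""

    def bucket_of(pair):
        # Hypothetical pair hash for integer items; any pair hash would work.
        a, b = pair
        return (a * 31 + b) % num_buckets

    # Pass 1: count single items and hash every pair into a bucket counter.
    item_counts = Counter()
    bucket_counts = [0] * num_buckets
    for basket in baskets:
        items = sorted(set(basket))
        item_counts.update(items)
        for pair in combinations(items, 2):
            bucket_counts[bucket_of(pair)] += 1

    # Between passes: reduce bucket counts to a bitmap of frequent buckets.
    bitmap = [count >= min_support for count in bucket_counts]
    frequent_items = {i for i, c in item_counts.items() if c >= min_support}

    # Pass 2: count only candidate pairs, i.e. both items are frequent
    # and the pair hashes to a frequent bucket.
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket) & frequent_items)
        for pair in combinations(items, 2):
            if bitmap[bucket_of(pair)]:
                pair_counts[pair] += 1

    # Final verification: keep pairs whose true count meets the threshold.
    return {p: c for p, c in pair_counts.items() if c >= min_support}


baskets = [[1, 2, 3], [1, 2], [1, 3], [2, 3], [1, 2, 4]]
print(pcy_frequent_pairs(baskets, min_support=3))  # {(1, 2): 3}
```

The bitmap can only produce false positives (an infrequent pair sharing a bucket with frequent ones), never false negatives, which is why the second pass still counts each surviving candidate exactly.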