Generalized K-Means Clustering

This project generalizes the Spark MLlib batch K-means clusterer (v1.1.0) and the Spark MLlib streaming K-means clusterer (v1.2.0). Most practical variants of K-means clustering are either implemented in this package or can be implemented with it.

If you find a novel variant of k-means clustering that is provably superior in some manner, implement it using the package and send a pull request along with the paper analyzing the variant!

This code has been tested on data sets of tens of millions of points in a 700+ dimensional space using a variety of distance functions. Thanks to the excellent core Spark implementation, it rocks!
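To illustrate what "a variety of distance functions" means in practice, here is a minimal single-machine sketch in plain Python. It is not this package's API: the names `kmeans`, `squared_euclidean`, and `generalized_kl` are hypothetical and chosen only for illustration. The sketch assumes the distance function is a Bregman divergence, for which the arithmetic mean of a cluster's points is the optimal center, so the usual Lloyd update still applies when the distance is swapped.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Sequence[float]
Divergence = Callable[[Point, Point], float]


def squared_euclidean(x: Point, c: Point) -> float:
    """Classic K-means distance: squared Euclidean distance."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def generalized_kl(x: Point, c: Point) -> float:
    """Generalized KL (I-)divergence; assumes strictly positive coordinates."""
    return sum(xi * math.log(xi / ci) - xi + ci for xi, ci in zip(x, c))


def mean(points: List[Point]) -> List[float]:
    """Coordinate-wise arithmetic mean of a non-empty list of points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def kmeans(
    points: List[Point],
    centers: List[Point],
    divergence: Divergence,
    max_iter: int = 20,
) -> Tuple[List[List[float]], List[int]]:
    """Lloyd's algorithm with a pluggable (Bregman) divergence.

    Hypothetical helper for illustration only, not the package's interface.
    """
    centers = [list(c) for c in centers]
    assignment: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each point goes to the center with least divergence.
        new_assignment = [
            min(range(len(centers)), key=lambda j: divergence(p, centers[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: the mean is the optimal center for any Bregman divergence.
        for j in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # leave an empty cluster's center unchanged
                centers[j] = mean(members)
    return centers, assignment
```

Calling `kmeans(data, initial_centers, squared_euclidean)` and `kmeans(data, initial_centers, generalized_kl)` on the same data runs the same loop with two different distance functions. Only the `divergence` argument changes.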

