python - AgglomerativeClustering with disconnected connectivity constraint


I have tried both the latest release, 0.16.1, and the bleeding-edge development version of sklearn, '0.17.dev0'; the issue appears in both.

I use

sklearn.cluster.AgglomerativeClustering(affinity='precomputed', connectivity=cmat, linkage='complete')

where cmat is a connectivity matrix that contains disconnected components. As indicated in the source code, this produces the warning: UserWarning: the number of connected components of the connectivity matrix is * > 1. Completing it to avoid stopping the tree early.

However, reading the source code, I see that at the point where the connectivity matrix is completed, the developers are themselves wondering whether the clustering could take place without completing the matrix:
"XXX: Can we do without completing the matrix?"
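To check ahead of time whether a given connectivity matrix would trigger this warning, one can count its connected components directly. The following is only a minimal sketch using SciPy's `connected_components`; the 5-sample matrix is a made-up example, not the `cmat` from the question.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

# Hypothetical connectivity matrix for 5 samples (an assumption for
# illustration): samples 0-1-2 form one chain, samples 3-4 form another,
# and no edge links the two groups.
cmat = csr_matrix(np.array([
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]))

# Treat the graph as undirected, as a symmetric connectivity matrix implies.
n_components, labels = connected_components(cmat, directed=False)

print(n_components)  # more than 1 means the graph is disconnected
print(labels)        # component index assigned to each sample
```

A result greater than 1 is the situation the warning describes.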

I am interested in this development. Does anyone know whether sklearn is planning to fix this and make clustering possible without completing the matrix? Has anyone implemented this themselves? I would gladly take advice on this!

For those interested, I figured out how to avoid the problem. If you use a distance matrix as input the way I do, and its values lie between 0 and 1, you can add a distance value of 1 between the corresponding disconnected components. I checked the source code, and the way it completes the connectivity matrix is by considering the minimum distance between the disconnected components. In this way, adding the highest possible distance of 1 (or any other maximum distance) forces the connectivity matrix to be completed in a more natural way, rather than merging the disconnected components.
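One possible reading of this workaround, sketched under assumptions: find the connected components of the connectivity matrix, then overwrite every distance between samples in different components with the maximum distance. The helper name `separate_components` and the toy matrices are hypothetical and are not part of sklearn.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components


def separate_components(dist, cmat, max_dist=1.0):
    """Return a copy of `dist` where pairs of samples that lie in different
    connected components of `cmat` are set to `max_dist`.

    Hypothetical helper for illustration only.
    """
    n_components, labels = connected_components(cmat, directed=False)
    out = np.array(dist, dtype=float, copy=True)
    # Boolean mask: True where sample i and sample j are in different components.
    different = labels[:, None] != labels[None, :]
    out[different] = max_dist
    return out, n_components


# Toy precomputed distances in [0, 1] for 4 samples (made-up values).
dist = np.array([
    [0.0, 0.2, 0.3, 0.9],
    [0.2, 0.0, 0.4, 0.8],
    [0.3, 0.4, 0.0, 0.1],
    [0.9, 0.8, 0.1, 0.0],
])

# Toy connectivity: samples 0-1 are linked, samples 2-3 are linked.
cmat = csr_matrix(np.array([
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]))

adjusted, n_components = separate_components(dist, cmat)
print(adjusted)
```

The adjusted matrix would then be passed to `fit` in place of the original distances, with the same `connectivity=cmat` argument. Distances within a component are left unchanged.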

