8

Python实现聚类算法的图像化处理

 3 years ago
source link: https://arminli.com/python-cluster/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

Python实现聚类算法的图像化处理

April 07, 2016

聚类算法参考这篇文章,本文是根据聚类算法得出的数据来绘制图像。

首先要对数据处理一下,在 DBSCAN 的算法中,我最后输出的 clusterID 不是连续的,为了方便做图我把所有点的 clusterID 从 0 开始按顺序排好,这段的代码是:

/*
deal.cpp


input: out.txt(x, y, clusterID)
 
9.000000 1.000000 5
10.000000 10.000000 16
2.100000 7.100000 1
1.100000 1.100000 1
1.100000 2.100000 1
1.100000 3.100000 1
1.100000 4.100000 1
8.000000 8.000000 6
9.000000 8.000000 6
1.100000 5.100000 1
2.100000 1.100000 1
2.100000 2.100000 1
2.100000 3.100000 1
2.100000 4.100000 1
2.100000 5.100000 1
2.100000 6.100000 1
8.000000 9.000000 6


output: pic.txt

2.1 7.1 0
1.1 1.1 0
1.1 2.1 0
1.1 3.1 0
1.1 4.1 0
2.1 6.1 0
2.1 5.1 0
1.1 5.1 0
2.1 1.1 0
2.1 2.1 0
2.1 3.1 0
2.1 4.1 0
9 1 1
8 9 2
9 8 2
8 8 2
10 10 3

*/
#include<cstring>
#include<cstdio>
#include<cstring>
#include<cmath>
#include<vector>
#include<string>
#include<iostream>
#include<algorithm>
using namespace std;

struct point{
    double a, b;
    int c;
}p[200000];
bool cmp(point x, point y){
    return x.c < y.c;
}
int main(){
    freopen("out.txt", "r", stdin);

    int id = 0;
    cout << id << endl;
    while(~scanf("%lf %lf %d", &p[id].a, &p[id].b, &p[id].c)){
        id++;
    }
    sort(p, p+id, cmp);
    int flag = -1, pre = -1;
    freopen("pic.txt", "w", stdout);

    for(int i = 0; i < id; i++){
        cout << p[i].a << " " << p[i].b << " ";
        if(p[i].c != pre){
            cout << ++flag << endl;
            pre = p[i].c;
        }else{
            cout << flag << endl;
        }
    }
    return 0;
}

然后用 python 做出图像就可以。

# coding=utf-8
import os
import sys
import matplotlib.pyplot as plt

# 支持8种不同颜色的点(0-7)
color_list = ['b', 'c', 'g', 'k', 'm', 'r', 'w', 'y']


def read_data(filename, xmax=11.0, ymax=11.0):
    try:
        with open(filename) as f:
            for row in f.readlines():
                x, y, n = row.split(' ')
                c = color_list[int(n)]
                draw_axes(xmax, ymax, x, y, c)

    except FileNotFoundError as e:
        print('No such file: ', e)
        sys.exit(-1)


def draw_axes(xmax, ymax, x, y, color):
    plt.axis((0, float(xmax), 0, float(ymax)))
    plt.scatter(x, y, c=color)


if __name__ == '__main__':
    filename = input('Please enter filename:')
    xmax = input('Please input xmax:')
    ymax = input('Please input ymax:')

    filename = os.getcwd() + '\\' + filename
    read_data(filename, xmax, ymax)

    plt.show()

做图的代码设置了 8 种颜色,刚才突然想到如果所需颜色很多的话,可以对 clusterID 取模来配色。这样就保证每个簇一个颜色但是会有重复的。如果这样处理的话就不需要 deal.cpp 来处理数据了。


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK