Tools | One Chart, 25 Visualization Tools: How Each One Gets It Done

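The scatter plot is a rather remarkable chart. As its name suggests, it is a scattering of seemingly chaotic dots with no obvious pattern, yet it can reveal underlying relationships that are hard to see in the raw data. Many people call it the "king of charts", and in the hands of data analysts it has grown into a powerful analysis tool.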

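Which tool do you usually choose for data visualization? Starting last May, Lisa Charlotte Rost tried 24 tools and languages to draw the same bubble chart. After half a year of learning and practice she found that there is no perfect visualization tool: each has its own strengths and weaknesses, but for particular fields and purposes some tools are clearly easier to recommend.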

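In the charts below, apps are shown in red and languages in blue.

The further left a tool sits, the better suited it is to data analysis; the further right, the better suited to presentation.

The further right, the more flexible.

Static on the left, interactive on the right.

The further left, the easier to get started; the higher up, the more flexible.

This is a tool recommendation chart, grouped by purpose.

Top left: quick and simple needs. Bottom left: story-driven work. Top right: analysis meant for sharing. Right: innovative charts. Bottom right: analytical tools.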

With the recommendations covered, those interested can see how each of the 24 tools produces the bubble chart.

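The same data source is used throughout. It has four fields: country, income per capita, life expectancy, and total population. The goal is a bubble chart with income per capita on the x-axis, life expectancy on the y-axis, and bubble size showing total population.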
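As a minimal sketch of the data preparation that the scripts below share, the following Python snippet parses a CSV with those four fields and computes the plotting coordinates. The column names `income`, `health`, and `population` match the scripts in this article; the `country` column name, the `load_bubbles` helper, and the two sample rows are assumptions made for illustration, not real figures.

```python
import csv
import io
import math

# Placeholder rows for illustration only; these are not real statistics.
SAMPLE_CSV = """country,income,health,population
A,1000,60.5,2000000
B,20000,78.0,500000
"""

def load_bubbles(text):
    """Parse CSV text into dicts with x = log(income), y = health, and population."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "country": row["country"],
            # natural log, like log() in the R and Python scripts below
            "x": math.log(float(row["income"])),
            "y": float(row["health"]),
            "population": int(row["population"]),
        })
    return rows

bubbles = load_bubbles(SAMPLE_CSV)
```

Taking the logarithm of income spreads out the low-income countries, which would otherwise be squeezed against the left edge of the chart.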

Tool 1: Excel

Tool 2: Google Sheets

Tool 3: Adobe Illustrator

Tool 4: RAW by DensityDesign

Tool 5: Lyra

Tool 6: Tableau Public

Tool 7: Polestar

Tool 8: Quadrigram

Tool 9: Highcharts Cloud

Tool 10: Easychart

Tool 11: Plotly

Tool 12: NodeBox

Tool 13: R – native

# set working directory
setwd("Desktop")
# read csv
d = read.csv("data.csv", header=TRUE)
# plot chart, set range for x-axis between 0 and 11
symbols(log(d$income), d$health, circles=d$population, xlim=c(0,11))

Tool 14: R – ggplot2

# import library
library(ggplot2)
# set working directory
setwd("Desktop")
# read csv
d = read.csv("data.csv", header=TRUE)
# plot chart
ggplot(d) +
  geom_point(aes(x=log(income),y=health,size=population)) +
  expand_limits(x=0)

Tool 15: R – ggvis

# import library
library(ggvis)
library(dplyr)
# set working directory
setwd("Desktop")
# read csv
d = read.csv("data.csv", header=TRUE)
# plot chart
d %>%
  ggvis(~income, ~health) %>%
  layer_points(size= ~population,opacity:=0.6) %>%
  scale_numeric("x",trans = "log",expand=0)

Tool 16: Python - matplotlib

# import libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# read data
data = pd.read_csv("data.csv")
# plot chart; s is the marker area, scaled down from raw population
plt.scatter(np.log(data['income']), data['health'], s=data['population']/1000000, c='black')
plt.xlim(left=0)  # set origin for x axis to zero (the xmin keyword was removed in newer matplotlib)
plt.show()
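In matplotlib, the `s` argument of `scatter` is the marker area in points squared, so passing a value proportional to population makes the bubble area (not the radius) proportional to population. The script above simply divides by 1,000,000. A slightly more general sketch rescales the values to a chosen maximum area; the helper name `bubble_areas`, the default `max_area`, and the sample values are assumptions for illustration.

```python
def bubble_areas(populations, max_area=2000.0):
    """Scale populations linearly so the largest one maps to max_area (points^2)."""
    largest = max(populations)
    if largest <= 0:
        raise ValueError("populations must contain a positive value")
    return [max_area * p / largest for p in populations]

# Placeholder values, not real population figures.
areas = bubble_areas([1_000_000, 250_000, 500_000])
```

The resulting list could be passed as `s=areas` to `plt.scatter`, which keeps the biggest bubble at a predictable size regardless of the dataset.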

Tool 17: Python - Seaborn

# import libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# read data
data = pd.read_csv("data.csv")
# plot chart (x and y passed as keywords, as newer seaborn versions require)
g = sns.regplot(x='income', y='health', data=data, color='k', fit_reg=False)
g.set_xscale('log')
plt.show()

Tool 18: Python - Bokeh

# import libraries
import pandas as pd
from bokeh.plotting import figure, show, output_file
# read data
data = pd.read_csv("data.csv")
# plot chart; circle() is used because it accepts a radius in data units
p = figure(x_axis_type="log")
p.circle(data['income'], data['health'], radius=data['population']/100000,
         fill_color='black', fill_alpha=0.6, line_color=None)
# write as html file and open in browser
output_file("scatterplot.html")
show(p)

Tool 19: Processing

void setup() {
  size(1000,500); // sets size of the canvas
  background(255); // sets background color
  scale(1, -1); // inverts the y axis
  translate(0, -height); // inverts the y axis, step 2
  Table table = loadTable("data.csv", "header"); // loads csv

  for (TableRow row : table.rows()) { // for each row in the csv, do:

    float health = row.getFloat("health");
    float income = row.getFloat("income");
    int population = row.getInt("population");
    // map the range of the column to the available height:
    float health_m = map(health,50,90,0,height);
    float income_log = log(income);
    float income_m = map(income_log,2.7, 5.13,0,width/4);
    float population_m = map(population,0,1376048943,1,140);

    ellipse(income_m,health_m,population_m,population_m); // draw the ellipse
  }
}
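The Processing sketch leans heavily on `map()`, which linearly re-maps a number from one range to another without clamping. A minimal Python equivalent, using the hypothetical name `remap`, shows the arithmetic behind a call such as `map(health,50,90,0,height)`:

```python
def remap(value, start1, stop1, start2, stop2):
    """Linearly re-map value from [start1, stop1] to [start2, stop2], without clamping."""
    return start2 + (stop2 - start2) * ((value - start1) / (stop1 - start1))

height = 500  # canvas height from size(1000, 500)
health_m = remap(70, 50, 90, 0, height)  # a value halfway through the input range
```

A life expectancy of 70 sits halfway between 50 and 90, so it lands halfway up the 500-pixel canvas.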

Tool 20: D3.js

<!-- mostly followed this example:
http://bl.ocks.org/weiglemc/6185069 -->
<!DOCTYPE html>
<html>
<head>
  <style>

  circle {
    fill: black;
    opacity:0.7;
  }

  </style>
  <script type="text/javascript" src="d3.v3.min.js"></script>
</head>
<body>
  <script type="text/javascript">

  // load data
  var data = d3.csv("data.csv", function(error, data) {

    // change string (from CSV) into number format
    data.forEach(function(d) {
      d.health = +d.health;
      d.income = Math.log(+d.income);
      d.population = +d.population;
      console.log(d.population, Math.sqrt(d.population))
    });

  // set scales
  var x = d3.scale.linear()
    .domain([0, d3.max(data, function(d) {return d.income;})])
    .range([0, 1000]);

  var y = d3.scale.linear()
    .domain([d3.min(data, function(d) {return d.health;}),
      d3.max(data, function(d) {return d.health; })])
    .range([500, 0]);

  var size = d3.scale.linear()
    .domain([d3.min(data, function(d) {return d.population;}),
      d3.max(data, function(d) {return d.population; })])
    .range([2, 40]);

  // append the chart to the website and set height&width
  var chart = d3.select("body")
    .append("svg:svg")
    .attr("width", 1000)
    .attr("height", 500)

  // draw the bubbles
  var g = chart.append("svg:g");
  g.selectAll("scatter-dots")
    .data(data)
    .enter().append("svg:circle")
        .attr("cx", function(d,i) {return x(d.income);})
        .attr("cy", function(d) {return y(d.health);})
        .attr("r", function(d) {return size(d.population);});
  });

  </script>
</body>
</html>

Tool 21: D3.js Templates

...
nv.addGraph(function() {

    var chart = nv.models.scatter() //define that it's a scatterplot
        .xScale(d3.scale.log()) //log scale
        .pointRange([10, 5000]) //define bubble sizes
        .color(['black']); //set color

    d3.select('#chart') //select the div in which the chart should be plotted
        .datum(exampleData)
        .call(chart);

    //plot the chart
    return chart;
});

Tool 22: Highcharts.js

<!DOCTYPE HTML>
<html>
  <head>
    <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js" type="text/javascript"></script>
    <script src="https://code.highcharts.com/highcharts.js"></script>
    <script src="https://code.highcharts.com/modules/data.js"></script>
    <script src="https://code.highcharts.com/highcharts-more.js"></script>
  </head>
  <body>
    <div id="chart"></div>

    <script>
    var url = 'data.csv';
    $.get(url, function(csv) {

    // A hack to see through quoted text in the CSV
    csv = csv.replace(/(,)(?=(?:[^"]|"[^"]*")*$)/g, '|');

    $('#chart').highcharts({
      chart: {
        type: 'bubble'
      },

      data: {
        csv: csv,
        itemDelimiter: '|',
        seriesMapping: [{
          name: 0,
          x: 1,
          y: 2,
          z: 3
          }]
        },

        xAxis: {
          type: "logarithmic"
        },
        colors: ["#000000"],
      });
    });

    </script>
  </body>
</html>
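The "hack to see through quoted text" in the Highcharts example replaces only those commas that sit outside double quotes, so a quoted field containing a comma stays in one piece. The sketch below reproduces the same regular expression in Python and, for comparison, parses the same line with the standard `csv` module, which handles quoted fields directly. The sample line is made up for illustration.

```python
import csv
import re

# Same pattern as the JavaScript hack: a comma followed only by unquoted
# characters or complete quoted strings up to the end of the input.
OUTSIDE_QUOTES_COMMA = re.compile(r',(?=(?:[^"]|"[^"]*")*$)')

line = '"a, b",1,2'  # made-up line whose first field contains a comma
piped = OUTSIDE_QUOTES_COMMA.sub("|", line)

# The csv module understands quoting, so no delimiter swap is needed.
fields = next(csv.reader([line]))
```

Here `piped` keeps the quoted comma and turns the other two into pipes, while `fields` comes back as three clean values.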

Tool 23: Vega

{
  "width": 1000,
  "height": 500,
  "data": [
    {
      "name": "data",
      "url": "data.csv",
      "format": {
        "type": "csv",
        "parse": {
          "income": "number"
        }
      }
    }
  ],
  "scales": [
    {
      "name": "xscale",
      "type": "log",
      "domain": {
        "data": "data",
        "field": ["income"]
      },
      "range": "width",
      "nice": true,
      "zero": true
    },
    {
      "name": "yscale",
      "type": "linear",
      "domain": {
        "data": "data",
        "field": ["health"]
      },
      "range": "height",
      "zero": false
    },
    {
      "name": "size",
      "type": "linear",
      "domain": {
        "data": "data",
        "field": "population"
      },
      "range": [0,700]
    }
  ],
  "axes": [
    {
      "type": "x",
      "scale": "xscale",
      "orient": "bottom"
    },
    {
      "type": "y",
      "scale": "yscale",
      "orient": "left"
    }
  ],
  "marks": [
    {
      "type": "symbol",
      "from": {
        "data": "data"
      },
      "properties": {
        "enter": {
          "x": {
            "field": "income",
            "scale": "xscale"
          },
          "y": {
            "field": "health",
            "scale": "yscale"
          },
          "size": {
            "field":"population",
            "scale":"size",
            "shape":"cross"
          },
          "fill": {"value": "#000"},
          "opacity": {"value": 0.6}
        }
      }
    }
  ]
}

Tool 24: Vega-Lite

{
  "data": {"url": "data.csv", "formatType": "csv"},
  "mark": "circle",
  "encoding": {
    "y": {
      "field": "health",
      "type": "quantitative",
      "scale": {"zero": false}
    },
    "x": {
      "field": "income",
      "type": "quantitative",
      "scale": {"type": "log"}
    },
    "size": {
      "field": "population",
      "type": "quantitative"
    },
    "color": {"value": "#000"}
  },
  "config": {"cell": {"width": 1000,"height": 500}}
  }

Tool 25: BIT 超级数据分析平台 (BIT Super Data Analysis Platform)

END.

Source: the 数据君 WeChat official account (datakong)

This article was shared from the WeChat official account PPV课数据科学社区 (ppvke123).

Originally published: 2017-04-07
