标签归档:自然语言处理

谷歌云平台上基于TensorFlow的高级机器学习专项课程

Deep Learning Specialization on Coursera

Coursera近期推了一门新专项课程:谷歌云平台上基于TensorFlow的高级机器学习专项课程(Advanced Machine Learning with TensorFlow on Google Cloud Platform Specialization),看起来很不错。这个系列包含5门子课程,涵盖端到端机器学习、生产环境机器学习系统、图像理解、面向时间序列和自然语言处理的序列模型、推荐系统等内容,感兴趣的同学可以关注:Learn Advanced Machine Learning with Google Cloud. Build production-ready machine learning models with TensorFlow on Google Cloud Platform.

课程链接:http://coursegraph.com/coursera-specializations-advanced-machine-learning-tensorflow-gcp
继续阅读

BERT相关论文、文章和代码亚美游AMG88汇总

Deep Learning Specialization on Coursera

BERT最近太火,蹭个热点,整理一下相关的亚美游AMG88,包括Paper, 代码和文章解读。

1、Google官方:

1) BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

一切始于10月Google祭出的这篇Paper, 瞬间引爆整个AI圈包括自媒体圈: https://arxiv.org/abs/1810.04805

2) Github: https://github.com/google-research/bert

11月Google推出了代码和预训练模型,再次引起群体亢奋。

3) Google AI Blog: Open Sourcing BERT: State-of-the-Art Pre-training for Natural Language Processing

2、第三方解读:
1) 张俊林博士的解读, 知乎专栏:从Word Embedding到Bert模型—自然语言处理中的预训练技术发展史

我们在AINLP微信公众号上转载了这篇文章和张俊林博士分享的PPT,欢迎关注:

2) 知乎: 如何评价 BERT 模型?

3) 【NLP】Google BERT详解

4) [NLP自然语言处理]谷歌BERT模型深度解析

5) BERT Explained: State of the art language model for NLP

6) BERT介绍

7) 论文解读:BERT模型及fine-tuning

8) NLP突破性成果 BERT 模型详细解读

9) 干货 | BERT fine-tune 终极实践教程: 奇点智能BERT实战教程,在AI Challenger 2018阅读理解任务中训练一个79+的模型。

10) 【BERT详解】《Dissecting BERT》by Miguel Romero Calvo
Dissecting BERT Part 1: The Encoder
Understanding BERT Part 2: BERT Specifics
Dissecting BERT Appendix: The Decoder

3、第三方代码:

1) pytorch-pretrained-BERT: https://github.com/huggingface/pytorch-pretrained-BERT
Google官方推荐的PyTorch BERB版本实现,可加载Google预训练的模型:PyTorch version of Google AI's BERT model with script to load Google's pre-trained models

2) BERT-pytorch: https://github.com/codertimo/BERT-pytorch
另一个Pytorch版本实现:Google AI 2018 BERT pytorch implementation

3) BERT-tensorflow: https://github.com/guotong1988/BERT-tensorflow
Tensorflow版本:BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

4) bert-chainer: https://github.com/soskek/bert-chainer
Chanier版本: Chainer implementation of "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

5) bert-as-service: https://github.com/hanxiao/bert-as-service
将不同长度的句子用BERT预训练模型编码,映射到一个固定长度的向量上:Mapping a variable-length sentence to a fixed-length vector using pretrained BERT model
这个很有意思,在这个基础上稍进一步是否可以做一个句子相似度计算服务?有没有同学一试?

6) bert_language_understanding: https://github.com/brightmart/bert_language_understanding
BERT实战:Pre-training of Deep Bidirectional Transformers for Language Understanding: pre-train TextCNN

7) sentiment_analysis_fine_grain: https://github.com/brightmart/sentiment_analysis_fine_grain
BERT实战,多标签文本分类,在 AI Challenger 2018 细粒度情感分析任务上的尝试:Multi-label Classification with BERT; Fine Grained Sentiment Analysis from AI challenger

8) BERT-NER: https://github.com/kyzhouhzau/BERT-NER
BERT实战,命名实体识别: Use google BERT to do CoNLL-2003 NER !

9) BERT-keras: https://github.com/Separius/BERT-keras
Keras版: Keras implementation of BERT with pre-trained weights

10) tbert: https://github.com/innodatalabs/tbert
PyTorch port of BERT ML model

11) BERT-Classification-Tutorial: https://github.com/Socialbird-AILab/BERT-Classification-Tutorial

12) BERT-BiLSMT-CRF-NER: https://github.com/macanv/BERT-BiLSMT-CRF-NER
Tensorflow solution of NER task Using BiLSTM-CRF model with Google BERT Fine-tuning

13) bert-Chinese-classification-task
bert中文分类实践

14) bert-chinese-ner: https://github.com/ProHiryu/bert-chinese-ner
使用预训练语言模型BERT做中文NER

持续更新,BERT更多相关亚美游AMG88欢迎补充,欢迎关注我们的微信公众号:AINLP

注:原创文章,转载请注明出处及保留链接“我爱自然语言处理”:

本文链接地址:BERT相关论文、文章和代码亚美游AMG88汇总 /?p=10870

一文详解深度学习在命名实体识别(NER)中的应用

Deep Learning Specialization on Coursera

近几年来,基于神经网络的深度学习方法在计算机视觉、语音识别等领域取得了巨大成功,另外在自然语言处理领域也取得了不少进展。在NLP的关键性基础任务—命名实体识别(Named Entity Recognition,NER)的研究中,深度学习也获得了不错的效果。最近,笔者阅读了一系列基于深度学习的NER研究的相关论文,并将其应用到达观的NER基础模块中,在此进行一下总结,与大家一起分享学习。

1、NER 简介

NER又称作专名识别,是自然语言处理中的一项基础任务,应用范围非常广泛。命名实体一般指的是文本中具有特定意义或者指代性强的实体,通常包括人名、地名、组织机构名、日期时间、专有名词等。NER系统就是从非结构化的输入文本中抽取出上述实体,并且可以按照业务需求识别出更多类别的实体,比如产品名称、型号、价格等。因此实体这个概念可以很广,只要是业务需要的特殊文本片段都可以称为实体。

学术上NER所涉及的命名实体一般包括3大类(实体类,时间类,数字类)和7小类(人名、地名、组织机构名、时间、日期、货币、百分比)。

实际应用中,NER模型通常只要识别出人名、地名、组织机构名、日期时间即可,一些系统还会给出专有名词结果(比如缩写、会议名、产品名等)。货币、百分比等数字类实体可通过正则搞定。另外,在一些应用场景下会给出特定领域内的实体,如书名、歌曲名、期刊名等。

NER是NLP中一项基础性关键任务。从自然语言处理的流程来看,NER可以看作词法分析中未登录词识别的一种,是未登录词中数量最多、识别难度最大、对分词效果影响最大问题。同时NER也是关系抽取、事件抽取、知识图谱、机器翻译、问答系统等诸多NLP任务的基础。

NER当前并不算是一个大热的研究课题,因为学术界部分学者认为这是一个已经解决的问题。当然也有学者认为这个问题还没有得到很好地解决,原因主要有:命名实体识别只是在有限的文本类型(主要是新闻语料中)和实体类别(主要是人名、地名、组织机构名)中取得了不错的效果;与其他信息检索领域相比,实体命名评测预料较小,容易产生过拟合;命名实体识别更侧重高召回率,但在信息检索领域,高准确率更重要;通用的识别多种类型的命名实体的系统性能很差。

2. 深度学习方法在NER中的应用

NER一直是NLP领域中的研究热点,从早期基于词典和规则的方法,到传统机器学习的方法,到近年来基于深度学习的方法,NER研究进展的大概趋势大致如下图所示。

图1:NER发展趋势

在基于机器学习的方法中,NER被当作序列标注问题。利用大规模语料来学习出标注模型,从而对句子的各个位置进行标注。NER 任务中的常用模型包括生成式模型HMM、判别式模型CRF等。条件随机场(ConditionalRandom Field,CRF)是NER目前的主流模型。它的目标函数不仅考虑输入的状态特征函数,而且还包含了标签转移特征函数。在训练时可以使用SGD学习模型参数。在已知模型时,给输入序列求预测输出序列即求使目标函数最大化的最优序列,是一个动态规划问题,可以使用Viterbi算法解码来得到最优标签序列。CRF的优点在于其为一个位置进行标注的过程中可以利用丰富的内部及上下文特征信息。

图2:一种线性链条件随机场

近年来,随着硬件计算能力的发展以及词的分布式表示(word embedding)的提出,神经网络可以有效处理许多NLP任务。这类方法对于序列标注任务(如CWS、POS、NER)的处理方式是类似的:将token从离散one-hot表示映射到低维空间中成为稠密的embedding,随后将句子的embedding序列输入到RNN中,用神经网络自动提取特征,Softmax来预测每个token的标签。

这种方法使得模型的训练成为一个端到端的过程,而非传统的pipeline,不依赖于特征工程,是一种数据驱动的方法,但网络种类繁多、对参数设置依赖大,模型可解释性差。此外,这种方法的一个缺点是对每个token打标签的过程是独立的进行,不能直接利用上文已经预测的标签(只能靠隐含状态传递上文信息),进而导致预测出的标签序列可能是无效的,例如标签I-PER后面是不可能紧跟着B-PER的,但Softmax不会利用到这个信息。

学界提出了DL-CRF模型做序列标注。在神经网络的输出层接入CRF层(重点是利用标签转移概率)来做句子级别的标签预测,使得标注过程不再是对各个token独立分类。

2.1 BiLSTM-CRF

LongShort Term Memory网络一般叫做LSTM,是RNN的一种特殊类型,可以学习长距离依赖信息。LSTM 由Hochreiter &Schmidhuber (1997)提出,并在近期被Alex Graves进行了改良和推广。在很多问题上,LSTM 都取得了相当巨大的成功,并得到了广泛的使用。LSTM 通过巧妙的设计来解决长距离依赖问题。

所有 RNN 都具有一种重复神经网络单元的链式形式。在标准的RNN中,这个重复的单元只有一个非常简单的结构,例如一个tanh层。

图3:传统RNN结构

LSTM 同样是这样的结构,但是重复的单元拥有一个不同的结构。不同于普通RNN单元,这里是有四个,以一种非常特殊的方式进行交互。

图4:LSTM结构

LSTM通过三个门结构(输入门,遗忘门,输出门),选择性地遗忘部分历史信息,加入部分当前输入信息,最终整合到当前状态并产生输出状态。

图5:LSTM各个门控结构

应用于NER中的biLSTM-CRF模型主要由Embedding层(主要有词向量,字向量以及一些额外特征),双向LSTM层,以及最后的CRF层构成。实验结果表明biLSTM-CRF已经达到或者超过了基于丰富特征的CRF模型,成为目前基于深度学习的NER方法中的最主流模型。在特征方面,该模型继承了深度学习方法的优势,无需特征工程,使用词向量以及字符向量就可以达到很好的效果,如果有高质量的词典特征,能够进一步获得提高。

图6:biLSTM-CRF结构示意图

2.2 IDCNN-CRF

对于序列标注来讲,普通CNN有一个不足,就是卷积之后,末层神经元可能只是得到了原始输入数据中一小块的信息。而对NER来讲,整个输入句子中每个字都有可能对当前位置的标注产生影响,即所谓的长距离依赖问题。为了覆盖到全部的输入信息就需要加入更多的卷积层,导致层数越来越深,参数越来越多。而为了防止过拟合又要加入更多的Dropout之类的正则化,带来更多的超参数,整个模型变得庞大且难以训练。因为CNN这样的劣势,对于大部分序列标注问题人们还是选择biLSTM之类的网络结构,尽可能利用网络的记忆力记住全句的信息来对当前字做标注。

但这又带来另外一个问题,biLSTM本质是一个序列模型,在对GPU并行计算的利用上不如CNN那么强大。如何能够像CNN那样给GPU提供一个火力全开的战场,而又像LSTM这样用简单的结构记住尽可能多的输入信息呢?

Fisher Yu and Vladlen Koltun 2015 提出了dilated CNN模型,意思是“膨胀的”CNN。其想法并不复杂:正常CNN的filter,都是作用在输入矩阵一片连续的区域上,不断sliding做卷积。dilated CNN为这个filter增加了一个dilation width,作用在输入矩阵的时候,会skip所有dilation width中间的输入数据;而filter本身的大小保持不变,这样filter获取到了更广阔的输入矩阵上的数据,看上去就像是“膨胀”了一般。

具体使用时,dilated width会随着层数的增加而指数增加。这样随着层数的增加,参数数量是线性增加的,而receptive field却是指数增加的,可以很快覆盖到全部的输入数据。

图7:idcnn示意图

图7中可见感受域是以指数速率扩大的。原始感受域是位于中心点的1x1区域:

(a)图中经由原始感受域按步长为1向外扩散,得到8个1x1的区域构成新的感受域,大小为3x3;

(b)图中经过步长为2的扩散,上一步3x3的感受域扩展为为7x7;

(c)图中经步长为4的扩散,原7x7的感受域扩大为15x15的感受域。每一层的参数数量是相互独立的。感受域呈指数扩大,但参数数量呈线性增加。

对应在文本上,输入是一个一维的向量,每个元素是一个character embedding:

图8:一个最大膨胀步长为4的idcnn块

IDCNN对输入句子的每一个字生成一个logits,这里就和biLSTM模型输出logits完全一样,加入CRF层,用Viterbi算法解码出标注结果。

在biLSTM或者IDCNN这样的网络模型末端接上CRF层是序列标注的一个很常见的方法。biLSTM或者IDCNN计算出的是每个词的各标签概率,而CRF层引入序列的转移概率,最终计算出loss反馈回网络。

3. 实战应用

3.1 语料准备

Embedding:我们选择中文维基百科语料来训练字向量和词向量。

基础语料:选择人民日报1998年标注语料作为基础训练语料。

附加语料:98语料作为官方语料,其权威性与标注正确率是有保障的。但由于其完全取自人民日报,而且时间久远,所以对实体类型覆盖度比较低。比如新的公司名,外国人名,外国地名。为了提升对新类型实体的识别能力,我们收集了一批标注的新闻语料。主要包括财经、娱乐、体育,而这些正是98语料中比较缺少的。由于标注质量问题,额外语料不能加太多,约98语料的1/4。

3.2 数据增强

对于深度学习方法,一般需要大量标注语料,否则极易出现过拟合,无法达到预期的泛化能力。我们在实验中发现,通过数据增强可以明显提升模型性能。具体地,我们对原语料进行分句,然后随机地对各个句子进行bigram、trigram拼接,最后与原始句子一起作为训练语料。

另外,我们利用收集到的命名实体词典,采用随机替换的方式,用其替换语料中同类型的实体,得到增强语料。

下图给出了BiLSTM-CRF模型的训练曲线,可以看出收敛是很缓慢的。相对而言,IDCNN-CRF模型的收敛则快很多。

图9:BiLSTM-CRF的训练曲线

图10:IDCNN-CRF的训练曲线

3.3 实例

以下是用BiLSTM-CRF模型的一个实例预测结果。

图11:BiLSTM-CRF预测实例

4. 总结

最后进行一下总结,将神经网络与CRF模型相结合的CNN/RNN-CRF成为了目前NER的主流模型。对于CNN与RNN,并没有谁占据绝对优势,各有各的优点。由于RNN有天然的序列结构,所以RNN-CRF使用更为广泛。基于神经网络结构的NER方法,继承了深度学习方法的优点,无需大量人工特征。只需词向量和字向量就能达到主流水平,加入高质量的词典特征能够进一步提升效果。对于少量标注训练集问题,迁移学习,半监督学习应该是未来研究的重点。

ABOUT

88集团赠送38彩金作者

朱耀邦:达观数据NLP算法工程师,负责达观数据NLP基础模块的研究、优化,以及NLP算法在文本挖掘系统中的具体应用。对深度学习、序列标注、实体及关系抽取有浓厚兴趣。

如何学习自然语言处理:NLP领域经典《自然语言处理综论》英文版第三版更新

Deep Learning Specialization on Coursera

如何学习NLP? 我觉得先要学好英语、数学和编程,因为英文世界的资料更丰富和原创,而数学会让你读论文的时候游刃有余、编程可以让你随时随地实现相关的idea。这好像是废话,那么闲话少说,进入正题。

去年写过一篇《如何学习自然语言处理:一本书和一门课》,介绍了NLP领域经典AMG88《自然语言处理综论(Speech and Language Processing)》第三版的相关情况,时隔一年,很多事情发生了变化,包括第二版的中文翻译版终于出了。作为NLP入门AMG88,十年前我读过这本书的第一版中文翻译版,第二版英文版;看到第二版中文翻译版和当前第三版英文版的相关内容,仿佛一个时代的跨越。

貌似为了方便2018年(斯坦福)秋季课程的原因,该书作者,NLP领域的大神 Daniel Jurafsky 教授和 James H. Martin 教授发布了一个截止2018年9月23日的单pdf文件:Speech and Language Processing (3rd ed. draft),包含了目前已经完成的所有章节,供用户下载和使用:

This is the release for the start of fall term 2018.
The slides are in the process of being updated now, we are putting them up as we write them.

Significantly rewritten version of 5, 6, 7, 8, 17, 18, 19, 23, 24, 25, and a draft of 9! New pedagogical sequences on neural networks and their training, starting with logistic regression and continuing with embeddings, feed-forward nets, and RNNs. Plus new or improved coverage of BPE, tf-idf, bias in embeddings, beam search decoding, HMMs, connotation frames, lexicon induction. reading comprehension/QA. Some chapters have been moved to the Appendix.

New lecture slides (so far) for chapters 6 and 25.

Here's a single pdf of the whole book-so-far!

Typos and comments welcome (just email slp3edbugs@gmail.com and let us know the date on the draft)!
And feel free to use the draft slides in your classes.

When will the book be finished? We're shooting for late 2019.

与之前的版本相比,重写了5、6、7、8、17、18、19、23、24、25章节的大部分内容和并新增了第9章节“递归神经网络中的序列处理(Sequence Processing with Recurrent Networks)”的草稿;调整了神经网络及其训练的教学顺序,从逻辑回归开始,到(词)嵌入,前馈网络以及递归神经网络;新增或者加大了BPE处理、tf-idf、柱搜索解码、隐马尔可夫模型、词典推理、阅读理解、自动问答等内容;一些旧的章节被移到附录。

另一个大家比较关心的问题,英文版第三版什么时候完工?官方预计要到2019年年底了。这本书英文版第一版自2000年出版,第二版英文版2008年出版,至今跨越接近20年,特别是这几年深度学习的风生水起,第三版增加了很多NLP和深度学习相关的内容,相对第二版变化有些大,这个第三版已完成章节的电子版草稿,总计有558页,估计全书完成时要秒杀第二版的厚度。

88集团赠送38彩金作者,两位都是NLP领域的神牛,以下是第二版中文翻译版中详细的介绍:

Daniel Jurafsky现任斯坦福大学语言学系和计算机科学系副教授。在此之前,他曾在博尔德的科罗拉多大学语言学系、计算机科学系和认知科学研究所任职。他出生于纽约州的Yonkers,1983年获语言学学士,1992年获计算机科学博士,两个学位都在伯克利加利福尼亚大学获得。他于1998年获得美国国家基金会CAREER奖,2002年获得Mac-Arthur奖。他发表过90多篇论文,内容涉及语音和语音处理的广泛领域。James H. Martin现任博尔德的科罗拉多大学语言学系、计算机科学系教授,认知科学研究所研究员。他出生于纽约市,1981年获可伦比亚大学计算机科学学士,1988年获伯克利加利福尼亚大学计算机科学博士。他写过70多篇88集团赠送38彩金计算机科学的论著,出版过《隐喻解释的计算机模型》(A Computational Model of Metaphor Interpretation)一书。

最后是如何下载这个电子版,其实官网上已经提供了相关的下载链接:https://web.stanford.edu/~jurafsky/slp3/ ,这篇文章上面的pdf也直接链向下载链接 ,如果还是无法下载这个电子版,可以关注我们的公众号:"NLPJob" , 回复 "slp3" 获取该书电子版以及 Daniel Jurafsky 教授之前在Coursera上开播的斯坦福大学自然语言处理课程相关资料视频(目前已绝版),一并学习自然语言处理。

注:原创文章,转载请注明出处及保留链接“我爱自然语言处理”:

本文链接地址:如何学习自然语言处理:NLP领域经典《自然语言处理综论》英文版第三版更新 /?p=10785

AI Challenger 2018 细粒度用户评论情感分析 fastText Baseline

Deep Learning Specialization on Coursera

上一篇《AI Challenger 2018 进行时》文尾我们提到 AI Challenger 官方已经在 GitHub 上提供了多个赛道的 Baseline: AI Challenger 2018 Baseline ,其中文本挖掘相关的3个主赛道均有提供,非常适合用来学习:英中文本机器翻译的 baseline 就直接用了Google官方基于Tensorflow实现的Tensor2Tensor跑神经网络机器翻译Transformer模型,这个思路是我在去年《AI Challenger 2017 奇遇记》里的终极方案,今年已成标配;细粒度用户评论情感分析提供了一个基于支持向量机(SVM)的多分类模型 baseline;观点型问题阅读理解提供一个深度学习模型 baseline , 基于pytorch实现论文《Multiway Attention Networks for Modeling Sentence Pairs》里的思路。

本次 AI Challenger 2018, 除了英中文本机器翻译,另一个我比较关注的赛道是: 细粒度用户评论情感分析。情感分析是自然语言处理里面的一个经典任务,估计很多同学入门NLP的时候都玩过 IMDB Movie Reviews Dataset , 这个可以定义为一个二分类的情感分类问题。不过这次 AI Challenger 的细粒度用户评论情感分析问题,并不是这么简单:
继续阅读

AI Challenger 2018 进行时

Deep Learning Specialization on Coursera

之前写过一篇《AI Challenger 2017 奇遇记》,记录了去年参加 AI Challenger 英中机器文本翻译比赛和英中机器同声传译比赛的过程,得到了一些反馈,特别是一些同学私下留言希望共享语料做科研用,但是限于去年比赛AI Challenger官方的约定,无法私下分享。不过好消息是,AI Challenger 2018 新赛季已经于8月29号启动,总奖金高达300万人民币,单个赛道冠军奖金最高到40万人民币。新赛季英中机器翻译文本大赛继续,提供了一批新的语料,中英双语句对规模大致到了1千3百万句对的水平,真的很赞。

我之前没有参加这类数据竞赛的经验,去年因为做 AIpatent专利机器翻译 产品的缘故,参加了 AI Challenger 2017 两个与机器翻译相关的赛道,并且侥幸进了英中机器同声传译比赛的 Top 5,过程中最大的收获其实是 follow 了一轮最新的神经网络机器翻译模型和试用了一些相关的NMT开源工具,另外也跟踪了机器翻译相关的论文,了解了当前机器翻译的进展情况,这些对于我的工作还是有相当帮助的。

10年前读研的时候,没有MOOC,没有Kaggle,也没有这么多开源的深度学习平台和工具,有时候不得不感慨,对于搞数据挖掘的同学来说,这是最好的时代。对于还在校学习的同学,如果实验室的任务不重,强烈建议参加类似 AI Challenger, Kaggle 这样的比赛,这可能是除了实习之外,又一个很好的积累实战经验的方法之一。在 NLPJob ,我们已经发现有一些招聘方加了一条加分项,例如:有Kaggle比赛获奖或者其他竞赛获奖的优先。而类似的,我们也发现很多同学的简历中参加Kaggle, 天池大数据等竞赛的经历逐渐成了标配。面向校招,在校同学缺乏实战经验,如果又没有一些很好的实验室项目或者实习经历作为筹码,那么参加这类比赛不失为一个很好的简历补充方式。

以下选自 AI Challenger 2018 的相关官方介绍,其中五大主赛道有三个与自然语言处理相关,可见NLP是多么的难。

继续阅读

专利文本数据挖掘之AIpatent

Deep Learning Specialization on Coursera

这两年,我花了很多时间在专利文本数据挖掘上,这是一件很好玩的事情。目前我们的产品陆续上线了,感兴趣的朋友可以关注:

AIpatent专利翻译引擎http://t.aipatent.com

AIpatent专利科技词典http://d.aipatent.com/

AIpatent专利情报信息http://x.aipatent.com/

接下来,还有好玩的AIpatent专利检索产品,敬请期待。

Coursera上Python课程(公开课)汇总推荐:从Python入门到应用Python

Deep Learning Specialization on Coursera

Python是深度学习时代的语言,Coursera上有很多Python课程,从Python入门到精通,从Python基础语法到应用Python,满足各个层次的需求,以下是Coursera上的Python课程整理,仅供参考,这里也会持续更新。

1. 密歇根大学的“Python for Everybody Specialization(人人都可以学习的Python专项课程)”

这个系列对于学习者的编程背景和数学要求几乎为零,非常适合Python入门学习。这个系列也是Coursera上最受欢迎的Python学习系列课程,强烈推荐。这个Python系列的目标是“通过Python学习编程并分析数据,开发用于采集,清洗,分析和可视化数据的程序(Learn to Program and Analyze Data with Python-Develop programs to gather, clean, analyze, and visualize data.” ,以下是88集团赠送38彩金这个系列的简介:

This Specialization builds on the success of the Python for Everybody course and will introduce fundamental programming concepts including data structures, networked application program interfaces, and databases, using the Python programming language. In the Capstone Project, you’ll use the technologies learned throughout the Specialization to design and create your own applications for data retrieval, processing, and visualization.

这个系列包含4门子课程和1门毕业项目课程,包括Python入门基础,Python数据结构, 使用Python获取网络数据(Python爬虫),在Python中使用数据库以及Python数据可视化等。以下是具体子课程的介绍:

1.1 Programming for Everybody (Getting Started with Python)

Python入门级课程,这门课程暂且翻译为“人人都可以学编程-从Python开始”,如果没有任何编程基础,就从这门课程开始吧:

This course aims to teach everyone the basics of programming computers using Python. We cover the basics of how one constructs a program from a series of simple instructions in Python. The course has no pre-requisites and avoids all but the simplest mathematics. Anyone with moderate computer experience should be able to master the materials in this course. This course will cover Chapters 1-5 of the textbook “Python for Everybody”. Once a student completes this course, they will be ready to take more advanced programming courses. This course covers Python 3.

1.2 Python Data Structures(Python数据结构)

Python基础课程,这门课程的目标是介绍Python语言的核心数据结构(This course will introduce the core data structures of the Python programming language.),88集团赠送38彩金这门课程:

This course will introduce the core data structures of the Python programming language. We will move past the basics of procedural programming and explore how we can use the Python built-in data structures such as lists, dictionaries, and tuples to perform increasingly complex data analysis. This course will cover Chapters 6-10 of the textbook “Python for Everybody”. This course covers Python 3.

1.3 Using Python to Access Web Data(使用Python获取网页数据--Python爬虫)

Python应用课程,只有使用Python才能学以致用,这门课程的目标是展示如何通过爬取和分析网页数据将互联网作为数据的源泉(This course will show how one can treat the Internet as a source of data):

This course will show how one can treat the Internet as a source of data. We will scrape, parse, and read web data as well as access data using web APIs. We will work with HTML, XML, and JSON data formats in Python. This course will cover Chapters 11-13 of the textbook “Python for Everybody”. To succeed in this course, you should be familiar with the material covered in Chapters 1-10 of the textbook and the first two courses in this specialization. These topics include variables and expressions, conditional execution (loops, branching, and try/except), functions, Python data structures (strings, lists, dictionaries, and tuples), and manipulating files. This course covers Python 3.

1.4 Using Databases with Python(Python数据库)

Python应用课程,在Python中使用数据库。这门课程的目标是在Python中学习SQL,使用SQLite3作为抓取数据的存储数据库:

This course will introduce students to the basics of the Structured Query Language (SQL) as well as basic database design for storing data as part of a multi-step data gathering, analysis, and processing effort. The course will use SQLite3 as its database. We will also build web crawlers and multi-step data gathering and visualization processes. We will use the D3.js library to do basic data visualization. This course will cover Chapters 14-15 of the book “Python for Everybody”. To succeed in this course, you should be familiar with the material covered in Chapters 1-13 of the textbook and the first three courses in this specialization. This course covers Python 3.

1.5 Capstone: Retrieving, Processing, and Visualizing Data with Python(毕业项目课程:使用Python获取,处理和可视化数据)

Python应用实践课程,这是这个系列的毕业项目课程,目的是通过开发一系列Python应用项目让学生熟悉Python抓取,处理和可视化数据的流程。

In the capstone, students will build a series of applications to retrieve, process and visualize data using Python. The projects will involve all the elements of the specialization. In the first part of the capstone, students will do some visualizations to become familiar with the technologies in use and then will pursue their own project to visualize some other data that they have or can find. Chapters 15 and 16 from the book “Python for Everybody” will serve as the backbone for the capstone. This course covers Python 3.

2. 多伦多大学的编程入门课程"Learn to Program: The Fundamentals(学习编程:基础) "

Python入门级课程。这门课程以Python语言传授编程入门知识,实为零基础的Python入门课程。感兴趣的同学可以参考课程图谱上的老课程评论 :http://coursegraph.com/coursera_programming1 ,之前一个同学的评价是 “两个老师语速都偏慢,讲解细致,又有可视化工具Python Visualizer用于详细了解程序具体执行步骤,可以说是零基础学习python编程的最佳选择。”

Behind every mouse click and touch-screen tap, there is a computer program that makes things happen. This course introduces the fundamental building blocks of programming and teaches you how to write fun and useful programs using the Python language.

3. 莱斯大学的Python专项课程系列:Introduction to Scripting in Python Specialization

入门级Python学习系列课程,涵盖Python基础, Python数据表示, Python数据分析, Python数据可视化等子课程,比较适合Python入门。这门课程的目标是让学生可以在处理实际问题是使用Python解决问题:Launch Your Career in Python Programming-Master the core concepts of scripting in Python to enable you to solve practical problems.

This specialization is intended for beginners who would like to master essential programming skills. Through four courses, you will cover key programming concepts in Python 3 which will prepare you to use Python to perform common scripting tasks. This knowledge will provide a solid foundation towards a career in data science, software engineering, or other disciplines involving programming.

这个系列包含4门子课程,以下是具体子课程的介绍:

3.1 Python Programming Essentials(Python编程基础)

Python入门基础课程,这门课程将讲授Python编程基础知识,包括表达式,变量,函数等,目标是让用户熟练使用Python:

This course will introduce you to the wonderful world of Python programming! We'll learn about the essential elements of programming and how to construct basic Python programs. We will cover expressions, variables, functions, logic, and conditionals, which are foundational concepts in computer programming. We will also teach you how to use Python modules, which enable you to benefit from the vast array of functionality that is already a part of the Python language. These concepts and skills will help you to begin to think like a computer programmer and to understand how to go about writing Python programs. By the end of the course, you will be able to write short Python programs that are able to accomplish real, practical tasks. This course is the foundation for building expertise in Python programming. As the first course in a specialization, it provides the necessary building blocks for you to succeed at learning to write more complex Python programs. This course uses Python 3. While many Python programs continue to use Python 2, Python 3 is the future of the Python programming language. This first course will use a Python 3 version of the CodeSkulptor development environment, which is specifically designed to help beginning programmers learn quickly. CodeSkulptor runs within any modern web browser and does not require you to install any software, allowing you to start writing and running small programs immediately. In the later courses in this specialization, we will help you to move to more sophisticated desktop development environments.

3.2 Python Data Representations(Python数据表示)

Python入门基础课程,这门课程依然关注Python的基础知识,包括Python字符串,列表等,以及Python文件操作:

This course will continue the introduction to Python programming that started with Python Programming Essentials. We'll learn about different data representations, including strings, lists, and tuples, that form the core of all Python programs. We will also teach you how to access files, which will allow you to store and retrieve data within your programs. These concepts and skills will help you to manipulate data and write more complex Python programs. By the end of the course, you will be able to write Python programs that can manipulate data stored in files. This will extend your Python programming expertise, enabling you to write a wide range of scripts using Python This course uses Python 3. While most Python programs continue to use Python 2, Python 3 is the future of the Python programming language. This course introduces basic desktop Python development environments, allowing you to run Python programs directly on your computer. This choice enables a smooth transition from online development environments.

3.3 Python Data Analysis(Python数据分析)

Python基础课程,这门课程将讲授通过Python读取和分析表格数据和结构化数据等,例如TCSV文件等:

This course will continue the introduction to Python programming that started with Python Programming Essentials and Python Data Representations. We'll learn about reading, storing, and processing tabular data, which are common tasks. We will also teach you about CSV files and Python's support for reading and writing them. CSV files are a generic, plain text file format that allows you to exchange tabular data between different programs. These concepts and skills will help you to further extend your Python programming knowledge and allow you to process more complex data. By the end of the course, you will be comfortable working with tabular data in Python. This will extend your Python programming expertise, enabling you to write a wider range of scripts using Python. This course uses Python 3. While most Python programs continue to use Python 2, Python 3 is the future of the Python programming language. This course uses basic desktop Python development environments, allowing you to run Python programs directly on your computer.

3.4 Python Data Visualization(Python数据可视化)

Python应用课程,这门课程将基于前3门课程学习的Python知识,抓取网络数据,然后清洗,处理和分析数据,并最终可视化呈现数据:

This if the final course in the specialization which builds upon the knowledge learned in Python Programming Essentials, Python Data Representations, and Python Data Analysis. We will learn how to install external packages for use within Python, acquire data from sources on the Web, and then we will clean, process, analyze, and visualize that data. This course will combine the skills learned throughout the specialization to enable you to write interesting, practical, and useful programs. By the end of the course, you will be comfortable installing Python packages, analyzing existing data, and generating visualizations of that data. This course will complete your education as a scripter, enabling you to locate, install, and use Python packages written by others. You will be able to effectively utilize tools and packages that are widely available to amplify your effectiveness and write useful programs.

4. 莱斯大学的计算(机)基础专项课程系列:Fundamentals of Computing Specialization

入门级Python编程学习课程系列,这个系列覆盖了大部分莱斯大学一年级计算机科学新生的学习材料,学生通过Python学习现代编程语言技巧,并将这些技巧应用到20个左右的有趣的编程项目中。

This Specialization covers much of the material that first-year Computer Science students take at Rice University. Students learn sophisticated programming skills in Python from the ground up and apply these skills in building more than 20 fun projects. The Specialization concludes with a Capstone exam that allows the students to demonstrate the range of knowledge that they have acquired in the Specialization.

这个系列包括Python交互式编程设计,计算原理,算法思维等6门课程和1门毕业项目课程,目标是让学生像计算机科学家一样编程和思考(Learn how to program and think like a Computer Scientist),以下是子课程的相关介绍:

4.1 An Introduction to Interactive Programming in Python (Part 1)(Python交互式编程导论上)

Python入门级课程,这门课程将讲授Python编程基础知识,例如普通表达式,条件表达式和函数,并用这些知识构建一个简单的交互式应用。

This two-part course is designed to help students with very little or no computing background learn the basics of building simple interactive applications. Our language of choice, Python, is an easy-to learn, high-level computer language that is used in many of the computational courses offered on Coursera. To make learning Python easy, we have developed a new browser-based programming environment that makes developing interactive applications in Python simple. These applications will involve windows whose contents are graphical and respond to buttons, the keyboard and the mouse. In part 1 of this course, we will introduce the basic elements of programming (such as expressions, conditionals, and functions) and then use these elements to create simple interactive applications such as a digital stopwatch. Part 1 of this class will culminate in building a version of the classic arcade game "Pong".

4.2 An Introduction to Interactive Programming in Python (Part 2)(Python交互式编程导论下)

Python入门级课程,这门课程将继续讲授Python基础知识,例如列表,词典和循环,并将使用这些知识构建一个简单的游戏例如Blackjack:

This two-part course is designed to help students with very little or no computing background learn the basics of building simple interactive applications. Our language of choice, Python, is an easy-to learn, high-level computer language that is used in many of the computational courses offered on Coursera. To make learning Python easy, we have developed a new browser-based programming environment that makes developing interactive applications in Python simple. These applications will involve windows whose contents are graphical and respond to buttons, the keyboard and the mouse. In part 2 of this course, we will introduce more elements of programming (such as list, dictionaries, and loops) and then use these elements to create games such as Blackjack. Part 1 of this class will culminate in building a version of the classic arcade game "Asteroids". Upon completing this course, you will be able to write small, but interesting Python programs. The next course in the specialization will begin to introduce a more principled approach to writing programs and solving computational problems that will allow you to write larger and more complex programs.

4.3 Principles of Computing (Part 1)(计算原理上)

编程基础课程,这门课程聚焦在了编程的基础上,包括编码标准和测试,数学基础包括概率和组合等。

This two-part course builds upon the programming skills that you learned in our Introduction to Interactive Programming in Python course. We will augment those skills with both important programming practices and critical mathematical problem solving skills. These skills underlie larger scale computational problem solving and programming. The main focus of the class will be programming weekly mini-projects in Python that build upon the mathematical and programming principles that are taught in the class. To keep the class fun and engaging, many of the projects will involve working with strategy-based games. In part 1 of this course, the programming aspect of the class will focus on coding standards and testing. The mathematical portion of the class will focus on probability, combinatorics, and counting with an eye towards practical applications of these concepts in Computer Science. Recommended Background - Students should be comfortable writing small (100+ line) programs in Python using constructs such as lists, dictionaries and classes and also have a high-school math background that includes algebra and pre-calculus.

4.4 Principles of Computing (Part 2)(计算原理下)

编程基础课程,这门课程聚焦在搜索、排序、递归等主题上:

This two-part course introduces the basic mathematical and programming principles that underlie much of Computer Science. Understanding these principles is crucial to the process of creating efficient and well-structured solutions for computational problems. To get hands-on experience working with these concepts, we will use the Python programming language. The main focus of the class will be weekly mini-projects that build upon the mathematical and programming principles that are taught in the class. To keep the class fun and engaging, many of the projects will involve working with strategy-based games. In part 2 of this course, the programming portion of the class will focus on concepts such as recursion, assertions, and invariants. The mathematical portion of the class will focus on searching, sorting, and recursive data structures. Upon completing this course, you will have a solid foundation in the principles of computation and programming. This will prepare you for the next course in the specialization, which will begin to introduce a structured approach to developing and analyzing algorithms. Developing such algorithmic thinking skills will be critical to writing large scale software and solving real world computational problems.

4.5 Algorithmic Thinking (Part 1)(算法思维上)

编程基础课程,这门课程聚焦在算法思维的培养上,讲授图算法的相关概念并用Python实现:

Experienced Computer Scientists analyze and solve computational problems at a level of abstraction that is beyond that of any particular programming language. This two-part course builds on the principles that you learned in our Principles of Computing course and is designed to train students in the mathematical concepts and process of "Algorithmic Thinking", allowing them to build simpler, more efficient solutions to real-world computational problems. In part 1 of this course, we will study the notion of algorithmic efficiency and consider its application to several problems from graph theory. As the central part of the course, students will implement several important graph algorithms in Python and then use these algorithms to analyze two large real-world data sets. The main focus of these tasks is to understand interaction between the algorithms and the structure of the data sets being analyzed by these algorithms. Recommended Background - Students should be comfortable writing intermediate size (300+ line) programs in Python and have a basic understanding of searching, sorting, and recursion. Students should also have a solid math background that includes algebra, pre-calculus and a familiarity with the math concepts covered in "Principles of Computing".

4.6 Algorithmic Thinking (Part 2)(算法思维下)

编程基础课程,这门课程聚焦在培养学生的算法思维,并了解一些高级算法主题,例如分治法,动态规划等:

Experienced Computer Scientists analyze and solve computational problems at a level of abstraction that is beyond that of any particular programming language. This two-part class is designed to train students in the mathematical concepts and process of "Algorithmic Thinking", allowing them to build simpler, more efficient solutions to computational problems. In part 2 of this course, we will study advanced algorithmic techniques such as divide-and-conquer and dynamic programming. As the central part of the course, students will implement several algorithms in Python that incorporate these techniques and then use these algorithms to analyze two large real-world data sets. The main focus of these tasks is to understand interaction between the algorithms and the structure of the data sets being analyzed by these algorithms. Once students have completed this class, they will have both the mathematical and programming skills to analyze, design, and program solutions to a wide range of computational problems. While this class will use Python as its vehicle of choice to practice Algorithmic Thinking, the concepts that you will learn in this class transcend any particular programming language.

4.7 The Fundamentals of Computing Capstone Exam(计算基础毕业项目课程)

Python应用课程,基于以上子课程的学习,计算基础毕业项目课程将用Python和所学的知识完成 20+ 项目:

While most specializations on Coursera conclude with a project-based course, students in the "Fundamentals of Computing" specialization have completed more than 20+ projects during the first six courses of the specialization. Given that much of the material in these courses is reused from session to session, our goal in this capstone class is to provide a conclusion to the specialization that allows each student an opportunity to demonstrate their individual mastery of the material in the specialization. With this objective in mind, the focus in this Capstone class will be an exam whose questions are updated periodically. This approach is designed to help insure that each student is solving the exam problems on his/her own without outside help. For students that have done their own work, we do not anticipate that the exam will be particularly hard. However, those students who have relied too heavily on outside help in previous classes may have a difficult time. We believe that this approach will increase the value of the Certificate for this specialization.

5. 密歇根大学的 Applied Data Science with Python(Python数据科学应用专项课程系列)

Python应用系列课程,这个系列的目标主要是通过Python编程语言介绍数据科学的相关领域,包括应用统计学,机器学习,信息可视化,文本分析和社交网络分析等知识,并结合一些流行的Python工具包,例如pandas, matplotlib, scikit-learn, nltk以及networkx等Python工具。

The 5 courses in this University of Michigan specialization introduce learners to data science through the python programming language. This skills-based specialization is intended for learners who have basic a python or programming background, and want to apply statistical, machine learning, information visualization, text analysis, and social network analysis techniques through popular python toolkits such as pandas, matplotlib, scikit-learn, nltk, and networkx to gain insight into their data. Introduction to Data Science in Python (course 1), Applied Plotting, Charting & Data Representation in Python (course 2), and Applied Machine Learning in Python (course 3) should be taken in order and prior to any other course in the specialization. After completing those, courses 4 and 5 can be taken in any order. All 5 are required to earn a certificate.

这个系列课程有5门课程,包括Python数据科学导论课程(Introduction to Data Science in Python),Python数据可视化(Applied Plotting, Charting & Data Representation in Python),Python机器学习(Applied Machine Learning in Python) ,Python文本挖掘(Applied Text Mining in Python) , Python社交网络分析(Applied Social Network Analysis in Python),以下是具体子课程的介绍:

5.1 Introduction to Data Science in Python(Python数据科学导论)

Python基础和应用课程,这门课程从Python基础讲起,然后通过pandas数据科学库介绍DataFrame等数据分析中的核心数据结构概念,让学生学会操作和分析表格数据并学会运行基础的统计分析工具。

This course will introduce the learner to the basics of the python programming environment, including how to download and install python, expected fundamental python programming techniques, and how to find help with python programming questions. The course will also introduce data manipulation and cleaning techniques using the popular python pandas data science library and introduce the abstraction of the DataFrame as the central data structure for data analysis. The course will end with a statistics primer, showing how various statistical measures can be applied to pandas DataFrames. By the end of the course, students will be able to take tabular data, clean it, manipulate it, and run basic inferential statistical analyses. This course should be taken before any of the other Applied Data Science with Python courses: Applied Plotting, Charting & Data Representation in Python, Applied Machine Learning in Python, Applied Text Mining in Python, Applied Social Network Analysis in Python.

5.2 Applied Plotting, Charting & Data Representation in Python(Python数据可视化)

Python应用课程,这门课程聚焦在通过使用matplotlib库进行数据图表的绘制和可视化呈现:

This course will introduce the learner to information visualization basics, with a focus on reporting and charting using the matplotlib library. The course will start with a design and information literacy perspective, touching on what makes a good and bad visualization, and what statistical measures translate into in terms of visualizations. The second week will focus on the technology used to make visualizations in python, matplotlib, and introduce users to best practices when creating basic charts and how to realize design decisions in the framework. The third week will describe the gamut of functionality available in matplotlib, and demonstrate a variety of basic statistical charts helping learners to identify when a particular method is good for a particular problem. The course will end with a discussion of other forms of structuring and visualizing data. This course should be taken after Introduction to Data Science in Python and before the remainder of the Applied Data Science with Python courses: Applied Machine Learning in Python, Applied Text Mining in Python, and Applied Social Network Analysis in Python.

5.3 Applied Machine Learning in Python(Python机器学习)

Python应用课程,这门课程主要聚焦在通过Python应用机器学习,包括机器学习和统计学的区别,机器学习工具包scikit-learn的介绍,有监督学习和无监督学习,数据泛化问题(例如交叉验证和过拟合)等。

This course will introduce the learner to applied machine learning, focusing more on the techniques and methods than on the statistics behind these methods. The course will start with a discussion of how machine learning is different than descriptive statistics, and introduce the scikit learn toolkit. The issue of dimensionality of data will be discussed, and the task of clustering data, as well as evaluating those clusters, will be tackled. Supervised approaches for creating predictive models will be described, and learners will be able to apply the scikit learn predictive modelling methods while understanding process issues related to data generalizability (e.g. cross validation, overfitting). The course will end with a look at more advanced techniques, such as building ensembles, and practical limitations of predictive models. By the end of this course, students will be able to identify the difference between a supervised (classification) and unsupervised (clustering) technique, identify which technique they need to apply for a particular dataset and need, engineer features to meet that need, and write python code to carry out an analysis. This course should be taken after Introduction to Data Science in Python and Applied Plotting, Charting & Data Representation in Python and before Applied Text Mining in Python and Applied Social Analysis in Python.

5.4 Applied Text Mining in Python(Python文本挖掘)

Python应用课程,这门课程主要聚焦在文本挖掘和文本分析基础,包括正则表达式,文本清洗,文本预处理等,并结合NLTK讲授自然语言处理的相关知识,例如文本分类,主题模型等。

This course will introduce the learner to text mining and text manipulation basics. The course begins with an understanding of how text is handled by python, the structure of text both to the machine and to humans, and an overview of the nltk framework for manipulating text. The second week focuses on common manipulation needs, including regular expressions (searching for text), cleaning text, and preparing text for use by machine learning processes. The third week will apply basic natural language processing methods to text, and demonstrate how text classification is accomplished. The final week will explore more advanced methods for detecting the topics in documents and grouping them by similarity (topic modelling). This course should be taken after: Introduction to Data Science in Python, Applied Plotting, Charting & Data Representation in Python, and Applied Machine Learning in Python.

5.5 Applied Social Network Analysis in Python(Python社交网络分析)

Python应用课程,这门课程通过Python工具包 NetworkX 介绍社交网络分析的相关知识。

This course will introduce the learner to network analysis through the NetworkX library. The course begins with an understanding of what network analysis is and motivations for why we might model phenomena as networks. The second week introduces the concept of connectivity and network robustness.. The third week will explore ways of measuring the importance or centrality of a node in a network. The final week will explore the evolution of networks over time and cover models of network generation and the link prediction problem. This course should be taken after: Introduction to Data Science in Python, Applied Plotting, Charting & Data Representation in Python, and Applied Machine Learning in Python.

您可以继续在课程图谱上挖掘Coursera上新的Python课程,也欢迎推荐到这里。

注:本文首发“课程图谱博客”:http://blog.coursegraph.com ,同步发布到这里,原文链接地址:http://blog.coursegraph.com/coursera%E4%B8%8Apython%E8%AF%BE%E7%A8%8B%EF%BC%88%E5%85%AC%E5%BC%80%E8%AF%BE%EF%BC%89%E6%B1%87%E6%80%BB%E6%8E%A8%E8%8D%90%EF%BC%9A%E4%BB%8Epython%E5%85%A5%E9%97%A8%E5%88%B0%E5%BA%94%E7%94%A8python

深度学习课程及深度学习公开课亚美游AMG88整理

Deep Learning Specialization on Coursera

这里整理一批深度学习课程或者深度学习相关公开课的亚美游AMG88,持续更新,仅供参考。

1. Andrew Ng (吴恩达) 深度学习专项课程 by Coursera and deeplearning.ai

这是 Andrew Ng 老师离开百度后推出的第一个深度学习项目(deeplearning.ai)的一个课程: Deep Learning Specialization ,课程口号是:Master Deep Learning, and Break into AI. 作为 Coursera 联合创始人 和 机器学习网红课程 "Machine Learning" 的授课者,Andrew Ng 老师引领了数百万同学进入了机器学习领域,而这门深度学习课程的口号也透露了他的野心:继续带领百万人进入深度学习的圣地。

作为 Andrew Ng 老师的粉丝,依然推荐这门课程作为深度学习入门课程首选,并且建议花费上 Coursera 上的课程,一方面可以做题,另外还有证书,最重要的是它的编程作业,是理解课程内容的关键点,仅仅看视频绝对是达不到这个效果的。参考:《Andrew Ng 深度学习课程小记》和《Andrew Ng (吴恩达) 深度学习课程小结》。

2. Geoffrey Hinton 大神的 面向机器学习的神经网络(Neural Networks for Machine Learning)

Geoffrey Hinton大神的这门深度学习课程 2012年在 Coursera 上开过一轮,之后一直沉寂,直到 Coursera 新课程平台上线,这门课程已开过多轮次,来自课程图谱网友的评论:

"Deep learning必修课"

"宗派大师+开拓者直接讲课,秒杀一切二流子"

这门深度学习课程相对上面 Andrew Ng深度学习课程有一定难道,但是没有编程作业,只有Quiz.

3. 牛津大学深度学习课程(2015): Deep learning at Oxford 2015

这门深度学习课程名字虽然是 "Machine Learning 2014-2015",不过主要聚焦在深度学习的内容上,可以作为一门很系统的机器学习深度学习课程:

Machine learning techniques enable us to automatically extract features from data so as to solve predictive tasks, such as speech recognition, object recognition, machine translation, question-answering, anomaly detection, medical diagnosis and prognosis, automatic algorithm configuration, personalisation, robot control, time series forecasting, and much more. Learning systems adapt so that they can solve new tasks, related to previously encountered tasks, more efficiently.

The course focuses on the exciting field of deep learning. By drawing inspiration from neuroscience and statistics, it introduces the basic background on neural networks, back propagation, Boltzmann machines, autoencoders, convolutional neural networks and recurrent neural networks. It illustrates how deep learning is impacting our understanding of intelligence and contributing to the practical design of intelligent machines.

视频Playlist:https://www.youtube.com/playlist?list=PLE6Wd9FR--EfW8dtjAuPoTuPcqmOV53Fu

参考:“牛津大学Nando de Freitas主讲的机器学习课程,重点介绍深度学习,还请来Deepmind的Alex Graves和Karol Gregor客座报告,内容、讲解都属一流,强烈推荐! 云: http://t.cn/RA2vSNX

4. Udacity 深度学习(中/英)by Google

Udacity (优达学城)上由Google工程师主讲的免费深度学习课程,结合Google自己的深度学习工具 Tensorflow ,很不错:

机器学习是发展最快、最令人兴奋的领域之一,而深度学习则代表了机器学习中最前沿但也最有风险的一部分。在本课内容中,你将透彻理解深度学习的动机,并设计用于了解复杂和/或大量数据库的智能系统。

我们将教授你如何训练和优化基本神经网络、卷积神经网络和长短期记忆网络。你将通过项目和任务接触完整的机器学习系统 TensorFlow。你将学习解决一系列曾经以为非常具有挑战性的新问题,并在你用深度学习方法轻松解决这些问题的过程中更好地了解人工智能的复杂属性。

我们与 Google 的首席科学家兼 Google 智囊团技术经理 Vincent Vanhoucke 联合开发了本课内容。此课程提供中文版本。

5. Udacity 纳米基石学位项目:深度学习

Udacity的纳米基石学位项目,收费课程,不过据说更注重实战:

人工智能正颠覆式地改变着我们的世界,而背后推动这场进步的,正是深度学习技术。优达学城和硅谷技术明星一起,带来这门帮你系统性入门的课程。你将通过充满活力的硅谷课程内容、独家实战项目和专业代码审阅,快速掌握深度学习的基础知识和前沿应用。

你在实战项目中的每行代码都会获得专业审阅和反馈,还可以在同步学习小组中,接受学长、导师全程的辅导和督促

6. fast.ai 上的深度学习系列课程

fast.ai上提供了几门深度学习课程,课程标语很有意思:Making neural nets uncool again ,并且 Our courses (all are free and have no ads):

Deep Learning Part 1: Practical Deep Learning for Coders
Why we created the course
What we cover in the course
Deep Learning Part 2: Cutting Edge Deep Learning for Coders
Computational Linear Algebra: Online textbook and Videos
Providing a Good Education in Deep Learning—our teaching philosophy
A Unique Path to Deep Learning Expertise—our teaching approach

7. 台大李宏毅老师深度学习课程:Machine Learning and having it Deep and Structured

难得的免费中文深度学习课程:

课程主页:http://speech.ee.ntu.edu.tw/~tlkagk/courses_MLDS17.html
课程视频Playlist: https://www.youtube.com/playlist?list=PLJV_el3uVTsPMxPbjeX7PicgWbY7F8wW9
B站搬运深度学习课程视频: https://www.bilibili.com/video/av9770302/

8. 台大陈缊侬老师深度学习应用课程:Applied Deep Learning / Machine Learning and Having It Deep and Structured

据说是美女老师,这门课程16年秋季开过一次,不过没有视频,最新的这期是17年秋季课程,刚刚开课,Youtube上正在陆续放出课程视频:

16年课程主页,有Slides等相关资料:https://www.csie.ntu.edu.tw/~yvchen/f105-adl/index.html
17年课程主页,资料正在陆续放出:https://www.csie.ntu.edu.tw/~yvchen/f106-adl/
Youtube视频,目前没有playlist,可以关注其官方号放出的视频:https://www.youtube.com/channel/UCyB2RBqKbxDPGCs1PokeUiA/videos

9. Yann Lecun 深度学习公开课

"Yann Lecun 在 2016 年初于法兰西学院开课,这是其中88集团赠送38彩金深度学习的 8 堂课。当时是用法语授课,后来加入了英文字幕。
作为人工智能领域大牛和 Facebook AI 实验室(FAIR)的负责人,Yann Lecun 身处业内机器学习研究的最前沿。他曾经公开表示,现有的一些机器学习公开课内容已经有些过时。通过 Yann Lecun 的课程能了解到近几年深度学习研究的最新进展。该系列可作为探索深度学习的进阶课程。"

10. 2016 年蒙特利尔深度学习暑期班

推荐理由:看看嘉宾阵容吧,Yoshua Bengio 教授循环神经网络,Surya Ganguli 教授理论神经科学与深度学习理论,Sumit Chopra 教授 reasoning summit 和 attention,Jeff Dean 讲解 TensorFlow 大规模机器学习,Ruslan Salakhutdinov 讲解学习深度生成式模型,Ryan Olson 讲解深度学习的 GPU 编程,等等。

11. 斯坦福大学深度学习应用课程:CS231n: Convolutional Neural Networks for Visual Recognition

这门面向计算机视觉的深度学习课程由Fei-Fei Li教授掌舵,内容面向斯坦福大学学生,货真价实,评价颇高:

Computer Vision has become ubiquitous in our society, with applications in search, image understanding, apps, mapping, medicine, drones, and self-driving cars. Core to many of these applications are visual recognition tasks such as image classification, localization and detection. Recent developments in neural network (aka “deep learning”) approaches have greatly advanced the performance of these state-of-the-art visual recognition systems. This course is a deep dive into details of the deep learning architectures with a focus on learning end-to-end models for these tasks, particularly image classification. During the 10-week course, students will learn to implement, train and debug their own neural networks and gain a detailed understanding of cutting-edge research in computer vision. The final assignment will involve training a multi-million parameter convolutional neural network and applying it on the largest image classification dataset (ImageNet). We will focus on teaching how to set up the problem of image recognition, the learning algorithms (e.g. backpropagation), practical engineering tricks for training and fine-tuning the networks and guide the students through hands-on assignments and a final course project. Much of the background and materials of this course will be drawn from the ImageNet Challenge.

12. 斯坦福大学深度学习应用课程: Natural Language Processing with Deep Learning

这门课程由NLP领域的大牛 Chris Manning 和 Richard Socher 执掌,绝对是学习深度学习自然语言处理的不二法门。

Natural language processing (NLP) is one of the most important technologies of the information age. Understanding complex language utterances is also a crucial part of artificial intelligence. Applications of NLP are everywhere because people communicate most everything in language: web search, advertisement, emails, customer service, language translation, radiology reports, etc. There are a large variety of underlying tasks and machine learning models behind NLP applications. Recently, deep learning approaches have obtained very high performance across many different NLP tasks. These models can often be trained with a single end-to-end model and do not require traditional, task-specific feature engineering. In this winter quarter course students will learn to implement, train, debug, visualize and invent their own neural network models. The course provides a thorough introduction to cutting-edge research in deep learning applied to NLP. On the model side we will cover word vector representations, window-based neural networks, recurrent neural networks, long-short-term-memory models, recursive neural networks, convolutional neural networks as well as some recent models involving a memory component. Through lectures and programming assignments students will learn the necessary engineering tricks for making neural networks work on practical problems.

这门课程融合了两位授课者之前在斯坦福大学的授课课程,分别是自然语言处理课程 cs224n (Natural Language Processing)和面向自然语言处理的深度学习课程 cs224d (Deep Learning for Natural Language Processing).

13. 斯坦福大学深度学习课程: CS 20SI: Tensorflow for Deep Learning Research

准确的说,这门课程主要是针对深度学习工具Tensorflow的:

Tensorflow is a powerful open-source software library for machine learning developed by researchers at Google Brain. It has many pre-built functions to ease the task of building different neural networks. Tensorflow allows distribution of computation across different computers, as well as multiple CPUs and GPUs within a single machine. TensorFlow provides a Python API, as well as a less documented C++ API. For this course, we will be using Python.

This course will cover the fundamentals and contemporary usage of the Tensorflow library for deep learning research. We aim to help students understand the graphical computational model of Tensorflow, explore the functions it has to offer, and learn how to build and structure models best suited for a deep learning project. Through the course, students will use Tensorflow to build models of different complexity, from simple linear/logistic regression to convolutional neural network and recurrent neural networks with LSTM to solve tasks such as word embeddings, translation, optical character recognition. Students will also learn best practices to structure a model and manage research experiments.

14. 牛津大学 & DeepMind 联合的面向NLP的深度学习应用课程: Deep Learning for Natural Language Processing: 2016-2017

课程主页:https://www.cs.ox.ac.uk/teaching/courses/2016-2017/dl/

github课程项目页面:https://github.com/oxford-cs-deepnlp-2017/

课程视频Playlist: https://www.youtube.com/playlist?list=PL613dYIGMXoZBtZhbyiBqb0QtgK6oJbpm

B站搬运视频: https://www.bilibili.com/video/av9817911/

15. 卡耐基梅隆大学(CMU)深度学习应用课程:CMU CS 11-747, Fall 2017 Neural Networks for NLP

课程主页:http://phontron.com/class/nn4nlp2017/

课程视频Playlist: https://www.youtube.com/watch?v=Sss2EA4hhBQ&list=PL8PYTP1V4I8ABXzdqtOpB_eqBlVAz_xPT

16. MIT组织的一个为期一周的深度学习课程: 6.S191: Introduction to Deep Learning http://introtodeeplearning.com/

17. 奈良先端科学技術大学院大学(NAIST) 2014年推出的一个深度学习短期课程(英文授课):Deep Learning and Neural Networks

18. Deep Learning course: lecture slides and lab notebooks

欢迎大家推荐其他没有覆盖到的深度学习课程。

注:本文首发“课程图谱博客”:http://blog.coursegraph.com ,同步发布到这里,原文链接地址:http://blog.coursegraph.com/深度学习课程亚美游AMG88整理,转载请注明出处。

如何学习自然语言处理:一本书和一门课

Deep Learning Specialization on Coursera

88集团赠送38彩金“如何学习自然语言处理”,有很多同学通过不同的途径留过言,这方面虽然很早之前写过几篇小文章:《如何学习自然语言处理》和《几本自然语言处理入门书》,但是更推崇知乎上这个问答:自然语言处理怎么最快入门,里面有微软亚洲研究院周明老师的系统回答和清华大学刘知远老师的倾情奉献:初学者如何查阅自然语言处理(NLP)领域学术资料,当然还包括其他同学的无私分享。

不过,对于希望入门NLP的同学来说,推荐你们先看一下这本书: Speech and Language Processing,第一版中文名译为《自然语言处理综论》,作者都是NLP领域的大大牛:斯坦福大学 Dan Jurafsky 教授和科罗拉多大学的 James H. Martin 教授。这也是我当年的入门书,我读过这本书的中文版(翻译自第一版英文版)和英文版第二版,该书第三版正在撰写中,作者已经完成了不少章节的撰写,所完成的章节均可下载:Speech and Language Processing (3rd ed. draft)。从章节来看,第三版增加了不少和NLP相关的深度学习的章节,内容和篇幅相对于之前有了更多的更新:

Chapter Slides Relation to 2nd ed.
1: Introduction [Ch. 1 in 2nd ed.]
2: Regular Expressions, Text Normalization, and Edit Distance Text [pptx] [pdf]
Edit Distance [pptx] [pdf]
[Ch. 2 and parts of Ch. 3 in 2nd ed.]
3: Finite State Transducers
4: Language Modeling with N-Grams LM [pptx] [pdf] [Ch. 4 in 2nd ed.]
5: Spelling Correction and the Noisy Channel Spelling [pptx] [pdf] [expanded from pieces in Ch. 5 in 2nd ed.]
6: Naive Bayes Classification and Sentiment NB [pptx] [pdf]
Sentiment [pptx] [pdf]
[new in this edition]
7: Logistic Regression
8: Neural Nets and Neural Language Models
9: Hidden Markov Models [Ch. 6 in 2nd ed.]
10: Part-of-Speech Tagging [Ch. 5 in 2nd ed.]
11: Formal Grammars of English [Ch. 12 in 2nd ed.]
12: Syntactic Parsing [Ch. 13 in 2nd ed.]
13: Statistical Parsing
14: Dependency Parsing [new in this edition]
15: Vector Semantics Vector [pptx] [pdf] [expanded from parts of Ch. 19 and 20 in 2nd ed.]
16: Semantics with Dense Vectors Dense Vector [pptx] [pdf] [new in this edition]
17: Computing with Word Senses: WSD and WordNet Intro, Sim [pptx] [pdf]
WSD [pptx] [pdf]
[expanded from parts of Ch. 19 and 20 in 2nd ed.]
18: Lexicons for Sentiment and Affect Extraction SentLex [pptx] [pdf] [new in this edition]
19: The Representation of Sentence Meaning
20: Computational Semantics
21: Information Extraction [Ch. 22 in 2nd ed.]
22: Semantic Role Labeling and Argument Structure SRL [pptx] [pdf]
Select [pptx] [pdf]
[expanded from parts of Ch. 19 and 20 in 2nd ed.]
23: Neural Models of Sentence Meaning (RNN, LSTM, CNN, etc.)
24: Coreference Resolution and Entity Linking
25: Discourse Coherence
26: Seq2seq Models and Summarization
27: Machine Translation
28: Question Answering
29: Conversational Agents
30: Speech Recognition
31: Speech Synthesis

另外该书作者之一斯坦福大学 Dan Jurafsky 教授曾经在Coursera上开设过一门自然语言处理课程:Natural Language Processing,该课程目前貌似在Coursera新课程平台上已经查询不到,不过我们在百度网盘上做了一个备份,包括该课程视频和该书的第二版英文,两个一起看,效果更佳:

2018.3 更新:链接: https://pan.baidu.com/s/1Wp35AyHY1PrmisA4deoC6Q 密码: sps4

对于一直寻找如何入门自然语言处理的同学来说,先把这本书和这套课程拿下来才是一个必要条件,万事先有个基础。

同时欢迎大家关注我们的公众号:NLPJob,回复"slp"获取该书和课程最新亚美游AMG88。