|
Yale School of Management: Assistant Professor of Marketing |
Seeking applications from graduating students, post-docs and others, with a Ph.D. or equivalent degree in Quantitative Marketing, Behavioral Marketing, or a related field. |
|
|
|
|
[D] Using ANNs on small data – Deep Learning vs. Xgboost |
submitted by /u/_alphamaximus_ [link] |
|
|
|
|
75 Big Data Terms to Know to Make your Dad Proud |
Here is a good list of 75 Big Data terms you can use to impress your father, even if you already bought him a gift.
|
|
|
|
|
[R] Value-Decomposition Networks For Cooperative Multi-Agent Learning |
submitted by /u/pauljasek [link] [comments]
|
|
|
|
|
The Machine Learning Algorithms Used in Self-Driving Cars |
Machine learning algorithms are used everywhere these days, from medical science to self-driving cars. Here we explain how they help solve challenges in the manufacturing of self-driving cars.
|
|
|
|
|
[D] Where are you with your career in ML? Alternatively, how many of you are developers now getting into ML? |
I'm currently a full-stack web developer, on a quest to dive as deep into ML as my skills can take me. I'm wondering who comes from a similar background as me, and how far along are you. Has anyone self-taught and made the leap from full-stack stuff to ML stuff? |
|
|
|
|
[D] Visualization tricks for 3-dim input and 2-dim output |
This is an RL setting: we have a 2-dimensional input and a 1-dimensional action (3 dimensions in total). The environment generates 2-dimensional observations. We would like to visualize the behavior of the environment. Is there a good way to plot it so that it is maximally informative? |
|
|
|
|
Top Stories, Jun 12-18: Top 15 Python Libraries for Data Science in 2017; Deep Learning Papers Reading Roadmap |
Top 15 Python Libraries for Data Science in 2017; Deep Learning Papers Reading Roadmap; The Practical Importance of Feature Selection; Understanding Deep Learning Requires Re-thinking Generalization; K-means Clustering with Tableau
|
|
|
|
|
[D] What are the real problem solving application of Elastic Weight Consolidation? |
About six months ago, DeepMind published "Overcoming catastrophic forgetting in neural networks". The paper was in the news because the authors found a way (EWC) to prevent "catastrophic forgetting". |
|
|
|
|
[P] Machine Learning for Image Content Analysis |
submitted by /u/jesueai [link] |
|
|
|
|
[D] CPU Max # of PCIe Lanes for a 4 GPU box |
Hey all I am planning to build a 4 GPU PC like Nvidia's devbox https://developer.nvidia.com/devbox for deep learning work (Not for gaming, SLI is not needed). |
|
|
|
|
[P] Practical Deep OCR for scene text using CTPN + CRNN |
submitted by /u/deepvideoanalytics |
|
|
|
|
[R] One Model To Learn Them All |
submitted by /u/xternalz [link] [comments]
|
|
|
|
|
Analytics Professionals: Get recognised, further your career, reduce your tax |
You can be recognised for your skills in data analytics in just six weeks by IAPA. Act before 30 June and claim the cost of the IAPA-certified credential as a tax deduction.
|
|
|
|
|
[N] An AI Primer with Wojciech Zaremba @ YC Podcast |
submitted by /u/sherjilozair [link] |
|
|
|
|
[D] Making use of derivative information for Neural Networks |
Sometimes when designing a regression model for a function, say y(x), it can be useful to supply derivative data to the model, particularly if the number of datapoints is limited. This can often increase the accuracy of the y(x) prediction. |
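The idea of supplying derivative data can be illustrated with a minimal sketch. It swaps the neural network for a cubic polynomial so the effect is exact and easy to check: two (x, y) points alone cannot pin down four coefficients, but adding the slope at each point gives four equations. The target function y(x) = x^3 - 2x, the sample points, and the helper names `value_row`, `derivative_row` and `solve` are all assumptions made for illustration.

```python
from fractions import Fraction

def value_row(x, degree):
    # Row of the design matrix for y(x) = c0 + c1*x + ... + cd*x^d.
    return [Fraction(x) ** k for k in range(degree + 1)]

def derivative_row(x, degree):
    # Row for y'(x) = c1 + 2*c2*x + ... + d*cd*x^(d-1).
    return [Fraction(0)] + [k * Fraction(x) ** (k - 1) for k in range(1, degree + 1)]

def solve(matrix, rhs):
    # Gauss-Jordan elimination with exact rational arithmetic.
    n = len(rhs)
    aug = [list(row) + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

# Hypothetical data sampled from y(x) = x^3 - 2x, with y'(x) = 3x^2 - 2.
xs = [1, 2]
ys = [-1, 4]      # function values
dys = [1, 10]     # derivative values

rows, rhs = [], []
for x, y, dy in zip(xs, ys, dys):
    rows.append(value_row(x, 3))
    rhs.append(y)
    rows.append(derivative_row(x, 3))
    rhs.append(dy)

coeffs = solve(rows, rhs)  # coefficients c0..c3 of the fitted cubic
```

With a neural network the same information would typically enter as an extra term in the loss that penalizes the mismatch between the model's derivative and the supplied derivative, rather than as extra rows in a linear system.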
|
|
|
|
[D] How would you use ML to detect fake user information? |
I was wondering how one could use machine learning techniques to detect "fake" users based on the personal information they provide (names, phone number, address...). I think it is a very interesting subject, but I could hardly find any paper on the subject. |
|
|
|
|
[D] What research papers optimize neural networks with sparse gradients? |
What papers are a must-read if I'm interested in how to optimize neural networks which are sparse (have many derivatives that are zero)? |
|
|
|
|
[P] python-recsys (SVD) with implicit feedback rather than ratings (recommender systems). |
I am building a simple recommender system using recsys libraries. Rather than "ratings data" I simply have implicit feedback of sales for items for users. Is it as simple as making my rating "1" for items where a sale has occurred and using SVD as is? Or will that not work at all? |
|
|
|
|
[P] GANGogh: Creating Art with GANs |
submitted by /u/rukjones4 [link] |
|
|
|
|
[P] Automatic Sub-Reddit Identifier By Parsing Reddit Titles - Fully working demo is ready now [Update] |
2-3 days ago I asked for help about creating a program that can automatically identify the correct sub-reddit category just by parsing the title. |
|
|
|
|
[D] I have questionnaire data with fixed questions and free text answers. What unsupervised techniques would you recommend to create a fixed feature space for each question? |
The number of training examples is very large: 30 million right now, eventually growing to 200 million. There are 10 questions, each with responses of 2-3 sentences. The domain is health surveys from outpatient clinic visits. Happy to answer any other questions you might have. |
|
|
|
|
Differentiable Neural Computer - implementation and thoughts |
Hey guys, Here is my implementation of DeepMind's Differentiable Neural Computer. I tested it on copy and bAbI tasks and I've put up some visualizations of the learning progress t |
|
|
|
|
[P] A TensorFlow Implementation of the Transformer: Attention Is All You Need |
submitted by /u/longinglove [link] |
|
|
|
|
[P] Azure NV6 (M60 GPU) for Deep Learning |
For an upcoming project we will be experimenting with Deep Learning approaches for NLP in an Azure environment (Amazon and Local are not an option right now). Azure offers NC6 (K80) and NV6 (M60) instances, but due to region restrictions it might be that only the M60 will be available. |
|
|
|
|
[P] Indexing Faces on Instagram - Searching Facial Features on Instagram |
submitted by /u/kendrick__ [link] |
|
|
|
|
[D] What is the best way to use history related to each training example as a feature for model |
Let's say I am trying to predict how well a player will do in a game, and I have a reasonable set of features for each player that can be used as training data to build a reasonably well-performing model. |
|
|
|
|
[P]Code & Data now available for Phase-Functioned Neural Networks for Character Control! |
submitted by /u/undefdev [l |
|
|
|
|
What is knyfe? |
knyfe is a python utility for rapid exploration of datasets. Use it when you have some kind of dataset and you want to get a feel for how it is composed, run some simple tests on it, or prepare it for further processing. |
|
|
|
|
[P] LSTM - How are the inputs connected? |
In this image: http://colah.github.io/posts/2015-08-Understanding-LSTMs/img/LSTM3-chain.png we can see some LSTM nodes. I'm confused on what X(t-1), X, and X(t+1) are. |
|
|
|
|
Weekly Digest, June 19 |
Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week. |
|
|
|
|
[N] Feeding Word2vec with tens of billions of items, what could possibly go wrong? |
submitted by /u/Agagla [link] |
|
|
|
|
[D] What unsolved problem keeps you up at night? |
I know researchers here are wary of getting sniped, so I'm not expecting specific ideas. I'm thinking of the crazy 'dream' ideas that you may have that aren't as approachable. |
|
|
|
|
[D] How to extract concept / topic from a text? |
Hi guys, my experience in the field of NLP is very limited. I use IBM Watson at the moment to extract the topic of text snippets (more or less 5 sentences and / or bullet points). |
|
|
|
|
[D] What Differs Humans, Machines, and Aliens |
submitted by /u/frangky |
|
|
|
|
Data Modelling Topologies of a Graph Database |
The associative data graph database model is still a heavy hitter, stacking up well against property graphs and triples/quadruples. Expect a comeback.
|
|
|
|
|
[D] How do people come up with all these crazy deep learning architectures? |
For the past few days, I've been reading the TensorFlow source code for some of the latest DL architectures (e.g. Tacotron, Wavenet), and the more I understand and visualize the architecture, the less sense it makes intuitively. |
|
|
|
|
[R] Stochastic Training of Neural Networks via Successive Convex Approximations |
submitted by /u/scardax88 [link] [comments]
|
|
|
|
|
[D] What smaller datasets do people use for POC for GANs? |
I just want to check whether something works at all. Since GANs are quite unstable anyway, I was wondering what people use for initial validation: MNIST, or some other dataset that is commonly used for smaller experiments? E.g. |
|
|
|
|
[P] Low loss but large amount of false positives? |
I'm trying to classify data into two classes, and my loss is less than 0.01 under both MSE and BCE. It seems contradictory to me that my performance on the training set is still so low: the ratio of true positives to false positives is at least 1:5 even when sweeping the threshold. |
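One possible explanation, offered here only as an assumption since the post does not describe its data, is heavy class imbalance: when positives are rare, average loss is dominated by the easy negatives. The sketch below uses entirely hypothetical scores and labels to show that MSE and BCE can both fall below 0.01 while false positives outnumber true positives 5 to 1.

```python
import math

# Hypothetical data: 10,000 examples, only 10 of them positive.
positive_scores = [0.3] * 10        # the 10 positives, all scored 0.3
hard_negatives = [0.3] * 50         # negatives the model cannot tell apart from positives
easy_negatives = [0.0001] * 9940    # negatives scored confidently near zero

scores = positive_scores + hard_negatives + easy_negatives
labels = [1] * 10 + [0] * 9990

def mse(scores, labels):
    # Mean squared error between scores and 0/1 labels.
    return sum((s - y) ** 2 for s, y in zip(scores, labels)) / len(labels)

def bce(scores, labels):
    # Mean binary cross-entropy; scores must lie strictly between 0 and 1.
    return -sum(
        y * math.log(s) + (1 - y) * math.log(1 - s)
        for s, y in zip(scores, labels)
    ) / len(labels)

def confusion(scores, labels, threshold):
    # Count true positives and false positives at a given threshold.
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    return tp, fp

mse_value = mse(scores, labels)
bce_value = bce(scores, labels)
tp, fp = confusion(scores, labels, threshold=0.25)
```

In this made-up setup no threshold improves the ratio, because the positives and the hard negatives receive identical scores. If imbalance is the cause, precision and recall describe the problem better than average loss does.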
|
|
|
|
[P] Visually searching Craigslist for very specific car sub-models using Tensorflow |
I retrained inception v3 on two classes of images. One image is of a steering wheel without two buttons, and the other class of images is of the same steering wheel, but with the two buttons on the right side. |
|
|
|
|
[D] PyData Tel Aviv Meetup: Amir Balaish | Attention Models |
submitted by /u/_alphamaximus_ [link] |
|
|
|
|
[D] Choosing the Right Deep-RL Algorithm |
Hello! My background isn't in Deep-RL; however, I have a good grasp of the concepts and enough programming skill to either write my own or sufficiently modify someone else's implementation of the latest and greatest Deep-RL paper. |
|
|
|
|
[D] RNN's equivalent to MNIST? |
When playing with some ideas for convnets you may first test them on a toy dataset like MNIST so that you can get a quick turnaround time. Is there an equivalent dataset for recurrent neural networks? The ideal would be something that can be trained in, say, one hour or less on one GPU. |
|
|
|
|
Tensorflow transfer learning tutorial |
submitted by /u/__The_Coder__ [link] |
|
|
|
|
Clear Capital: Chief Data Scientist/Advanced Analytics Director |
Seeking a Chief Data Scientist/Advanced Analytics Director who is a business minded data scientist with capitalism built into their DNA, a technology polyglot, fluent in R and SQL + (Java, Python, Scala, NoSQL, etc), who imagines a data model that is widely accepted and highly profitable in our |
|
|
|
|
[D] How does the neuron count of state-of-the-art networks compare with biological brains? |
There is a great Wikipedia page summarising the neuron counts of animals. How do the most successful neural networks compare? I see that a house mouse has around 71 million neurons. |
|
|
|
|
[P] LSTM Lookback Issues |
I'm having a little bit of trouble understanding how to set up LSTM properly. In particular I'm confused on its ability to carry information to future iterations - how does one set how long it carries information (i.e. when it forgets?). |
|
|
|
|
[D] For those of you that work in industry and use machine learning, what ML algorithms do you use and what do you use them for? |
submitted by /u/Gurung11 [link] |
|
|
|
|
MSc in Applied Data Science, Big Data – Online and Part-time |
Data ScienceTech Institute is the 1st private postgraduate school in pure Data Science & Big Data education in France! |
|
|
|
|
[R] Deal or No Deal? End-to-End Learning for Negotiation Dialogues |
submitted by /u/jivatman [link] [comments] |
|
|
|
|
[D] Looking for tools to automatically tag audio sample libraries |
Hello. First of all, I'm not sure if this post fits the rules. I'm just going to go ahead. I have a huge audio library with audio samples of drums I gathered over the years and am looking for a tool with which I can automatically sort these samples. |
|
|
|
|
Chief Analytics Officer, Fall returns bigger and better in 2017 |
Chief Analytics Officer, Oct 2-5 in Boston, will be the largest, most senior gathering of analytics leaders in North America, providing a platform for 300+ attendees and 125+ speakers to share best practices and explore strategies for driving actionable insights through analytics. |
|
|
|
|
[R] Optimization of Tree Ensembles |
submitted by /u/vvmisic0 [link] [comments]
|
|
|
|
|
The Real “Fear” of AI is Automation Inundation |
The biggest threat to minimum wage earners (and beyond, quite frankly) is the new tsunami of automation in the workplace.
|
|
|
|
|
[R] Variational Approaches for Auto-Encoding Generative Adversarial Networks |
submitted by /u/pauljasek [link] [comments]
|
|
|
|
|
K-means Clustering with Tableau – Call Detail Records Example |
We show how to use Tableau 10 clustering feature to create statistically-based segments that provide insights about similarities in different groups and performance of the groups when compared to each other.
|
|
|
|
|
[N] DeepMind Open Source: Datasets |
submitted by /u/pp314159 [link] |
|
|
|
|
Understanding Deep Learning Requires Re-thinking Generalization |
What is it that distinguishes neural networks that generalize well from those that don’t? A satisfying answer to this question would not only help to make neural networks more interpretable, but it might also lead to more principled and reliable model architecture design.
|
|
|
|
|
[D] Random Effects Neural Networks in Edward and Keras |
submitted by /u/_alphamaximus_ [link] |
|
|
|
|
[N] Google Released MobileNets: Efficient Pre-Trained Tensorflow Computer Vision Models |
submitted by /u/Dutchcheesehead |
|
|
|
|
[D] My algorithm can make sentences from given words |
I made an algorithm that takes biomedical words (<13) and makes a sentence for you by brute-force search. |
|
|
|
|
[R] Sobolev Training for Neural Networks [DeepMind] |
submitted by /u/asobolev [link] [comments]
|
|
|
|
|
[P] Saliency detection with convolutional autoencoder |
submitted by /u/Seon-Ho [link] |
|
|
|
|
Interesting/unusual way of using machine learning to study microbial ecology |
submitted by /u/benlibb [link] |
|
|
|
|
[D] What Can't Deep Learning Do? |
submitted by /u/visarga [link] |
|
|
|
|
[D] Andrew Rowan - Bayesian Deep Learning with Edward (and a trick using Dropout) |
submitted by /u/_alphamaximus_ [link] |
|
|
|
|
[P] Logical Poet |
https://github.com/seominlee/Logical-Poet submitted by /u/seominlee [link] |
|
|
|
|
[D] Medium is the new method! Has evaluation by image generation misled researchers and driven GAN research off-track? |
Recently I have been exploring the GAN literature, and while there's no doubt that the idea behind the adversarial training process is interesting, I found the over-reliance on image generation disturbing. |
|
|
|
|
[D] How does licensing work with regards to model architectures and other specific solutions/ideas in publicly-available research papers (arxiv)? |
Usually such information is not mentioned in the body of the text. Is it best (or even wise) to email the authors? Does the answer change if I were to attempt to ship a commercial product with an architecture that some team spent many hours working on? |
|
|
|
|
[R] [1706.04638] Proximal Backpropagation |
submitted by /u/SquirrelNine [link] [comments]
|
|
|
|
|
[P]Chainer implementation of "Attention Is All You Need" |
Hi, I'm working on implementing this paper, https://arxiv.org/pdf/1706.03762.pdf Here's my code at this link: https://github.com/soskek/attention_is_all_you_need These are the area |
|
|
|
|
75 Big Data Terms To Make Your Dad Proud of You on Father's Day |
|
|
|
|
|
[N] Tensorflow v1.2 Released |
submitted by /u/ntenenz [link] |
|
|
|
|
[R] Learning Deep ResNet Blocks Sequentially using Boosting Theory |
submitted by /u/xternalz [link] [comments]
|
|
|
|
|
[N] Supercharge your Computer Vision models with the TensorFlow Object Detection API |
submitted by /u/deepvideoanalytics |
|
|
|
|
[R] PyTorch Implementation of "Principled Detection of Out-of-Distribution Examples in Neural Networks" (UIUC, Cornell) |
submitted by /u/howdygoop [link] |
|
|
|
|
[R] FreezeOut: Accelerate training by up to 20% by progressively freezing layers. Based on a reddit comment and a subsequent 96 hour science binge. |
submitted by /u/ajmooch [link] [comments]
|
|
|
|
|
Sales Data Analysis using DataIku Studio |
Overview
|
|
|
|
|
Tensorflow 1.2 Released |
submitted by /u/ntenenz [link] |
|
|
|
|
[R] Introducing source-contrastive estimation |
submitted by /u/amplifier_khan |
|
|
|
|
[R] [1706.04223] Adversarially Regularized Autoencoders for Generating Discrete Structures |
submitted by /u/evc123 [link] [comments]
|
|
|
|
|
13 Great Blogs Posted in the last 12 Months |
This is part of a new series of articles: once or twice a month, we post previous articles that were very popular when first published. These articles are at least 6 months old but no more than 12 months old. |
|
|
|
|
Staples: Data Scientist |
Seeking a Data Scientist, responsible for applying advanced analytical methods to improve decision-making at Staples, using machine learning, statistical analysis, and mathematical optimization and playing a key role in supporting the development of state-of-the-art solutions.
|
|
|
|
|
[D] implementation of cramer-GAN for celebA |
Does anyone know of an implementation of this that's ready for celebA or any 64x64 image dataset? |
|
|
|
|
What is Data Exhaust and What Can You Do With It? |
Data exhaust? No, not exhaustion from data. Simply put, data exhaust is the data that a business collects that it doesn’t currently think it can put to use. The biggest producers of data exhaust are manufacturers and sometimes retailers. |
|
|
|
|
Career Advice for Students from 2017 Data Science Leaders |
|
|
|
|
|
Machine learning made simple with Apache Spark |
Powered by Apache Spark, Databricks provides an end-to-end platform designed to help data engineers and data scientists easily implement advanced analytics at scale. Download the Making Machine Learning Simple Whitepaper from Databricks to learn more.
|
|
|
|
|
Decision Trees Tutorial |
Would you survive a disaster?
|
|
|
|
|
Thursday News: AI, IoT, Data Science, Machine Learning |
Here is our selection of featured articles and resources posted since Monday: |
|
|
|
|
Hadoop as a Data Warehouse: Cracking the Code with Kudu |
Here we discuss the problems behind replacing an existing Data Warehouse with Hadoop and the available solutions to make this happen. Let's see how.
|
|
|
|
|
The Terrible Deep Learning List |
submitted by /u/tfzb [link] |
|
|
|
|
Classifying segmented strokes as characters – Part 3 of an XKCD font saga |
In part two of my XKCD font saga I was able to separate strokes from the XKCD handwriting dataset into many smaller images. |
|
|
|
|
The Surprising Complexity of Randomness |
We have pseudorandom numbers because generating true random numbers using a computer is difficult. Computers, by design, are excellent at taking a set of instructions and carrying them out in the exact same way, every single time.
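The determinism described here is easy to see in a minimal sketch using Python's standard `random` module: a generator started from the same seed replays the same sequence. The seed value 2017 and the helper name `draw` are arbitrary choices made for illustration.

```python
import random

def draw(seed, count=5):
    # A fresh generator with a fixed seed always starts from the same internal state.
    rng = random.Random(seed)
    return [rng.randint(0, 99) for _ in range(count)]

first = draw(2017)
second = draw(2017)

# Same seed, same instructions, same "random" numbers.
print(first == second)
```

This reproducibility is useful for repeatable experiments, and it is also why such generators are unsuitable for security-sensitive uses, where Python's `secrets` module draws on the operating system's randomness source instead.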
|
|
|
|
|
[D] Sentiment analysis of Twitter tweets |
What are some good papers on getting started with sentiment analysis of Twitter tweets? I would like to get an understanding of how to classify people's reaction to a tweet. The idea is to follow tweets from organisations on Twitter. This way the initial tweets are at least coherent. |
|
|
|
|
Medical Image Analysis with Deep Learning , Part 3 |
In this article we focus on basic deep learning using Keras and Theano. We work through two examples: one using Keras for basic predictive analytics, and the other a simple image analysis example using VGG.
|
|
|
|
|
Another batch of Think Stats notebooks |
Getting ready to teach Data Science in the spring, I am going back through Think Stats and updating the Jupyter no |
|
|
|
|
[D] Best second language after Python for ML purposes |
I'm fairly comfortable with Python and I've been wanting to learn another language, and I've been wondering what language would be the best choice if my main interest is in machine learning/AI. |
|
|
|
|
[P]Keras implementation of "A simple neural network module for relational reasoning", beats SOTA on Cornell NLVR |
submitted by /u/phreeza [link] |
|
|
|
|
Feasibility of a personal knowledge management system based on statistical analysis and data mining |
We start from the principle that natural language is too complex for computer programs, which is why it is difficult to write simple programs that can mine natural language effectively. |
|
|
|