Jun 16, 2017
quora question pairs
These days I have been spending more time studying statistics and probability than reading the papers about new neural network architectures that pop up every day, with the goal of gaining a deeper understanding.
To address the issue of duplicate questions, Quora developed their own algorithms to detect them. On top of that, a while ago Quora published their first public dataset of question pairs so that machine learning (ML) engineers could try to come up with a better algorithm for detecting duplicate questions, and they created a competition on Kaggle.
Here is how the competition works: ML engineers and ML engineer wannabes -cough-me-cough- who have too much time on their hands (or are not properly supervised at work) download the competition’s training set (which looks like fig. 1) and develop machine learning algorithms that learn by going through the examples in the training set. The training set is usually manually labeled. In this case, a 1 or 0 in the is_duplicate column indicates whether the two questions are duplicates of each other, i.e. whether they ask the same thing. Here the training set contains ~420K question pairs.
id | question1 | question2 | is_duplicate |
---|---|---|---|
1 | Why my answers are collapsed? | Why is my answer collapsed at once? | 0 |
2 | How do I post a question in Quora? | How do I ask a question in Quora? | 1 |
3 | Can I fit my booboos in a 65ml jar? | Is 1 baba worth 55 booboo (おっぱい) ☃? | 0 |
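
To make the format of the training set concrete, here is a minimal sketch of reading labeled question pairs and scoring them with a deliberately naive word-overlap baseline. This is not Quora's algorithm or any actual competition entry. The helper names (`load_pairs`, `jaccard`, `predict_duplicate`) and the 0.5 threshold are assumptions made up for illustration, and the inline sample only uses the columns shown in fig. 1.

```python
import csv
import io

# Two rows in the same shape as fig. 1 (assumed CSV layout, for illustration only).
SAMPLE_CSV = """id,question1,question2,is_duplicate
1,Why my answers are collapsed?,Why is my answer collapsed at once?,0
2,How do I post a question in Quora?,How do I ask a question in Quora?,1
"""


def load_pairs(text):
    """Parse CSV text into a list of (question1, question2, label) tuples."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        (row["question1"], row["question2"], int(row["is_duplicate"]))
        for row in reader
    ]


def tokenize(question):
    """Lowercase, split on whitespace, and strip basic punctuation from each word."""
    words = {w.strip("?.,!") for w in question.lower().split()}
    words.discard("")
    return words


def jaccard(q1, q2):
    """Share of distinct words that the two questions have in common."""
    a, b = tokenize(q1), tokenize(q2)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def predict_duplicate(q1, q2, threshold=0.5):
    """Hypothetical baseline: call a pair a duplicate if word overlap is high enough."""
    return 1 if jaccard(q1, q2) >= threshold else 0


if __name__ == "__main__":
    for q1, q2, label in load_pairs(SAMPLE_CSV):
        print(label, predict_duplicate(q1, q2), round(jaccard(q1, q2), 2))
```

A learned model would replace the fixed threshold with parameters fitted on the labeled examples. The baseline is only here to show what "input pair in, 0 or 1 out" looks like.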