Tuesday, November 15, 2016

Using the Google Prediction API to Gauge Commenter Sentiment


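After IBM Watson, today I tried another machine learning framework: the Google Prediction API. As before, the goal is sentiment analysis: deciding whether a user's comment is a positive or a negative review. Testing the Google Prediction API takes six steps: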

1. Enable the Google Prediction API in the Google Cloud Console

2. Prepare the training data. I used the data from https://inclass.kaggle.com/c/si650winter11/data.

3. Upload it to Google Cloud Storage
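A minimal sketch of the data preparation in step 2. It assumes the Kaggle file holds one `label<TAB>sentence` pair per line with labels 1 and 0, and that the Prediction API wants CSV rows with the label in the first column. The `LABELS` mapping and the function name `to_prediction_csv` are my own illustration, chosen to match the "Positive" and "Negative" labels that show up in the prediction responses further down.

```python
import csv
import io

# Assumed mapping from the Kaggle labels (1/0) to the label names
# that appear in the prediction responses in this post.
LABELS = {"1": "Positive", "0": "Negative"}


def to_prediction_csv(tsv_text):
    """Convert 'label<TAB>sentence' lines into fully quoted CSV rows."""
    out = io.StringIO()
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for line in tsv_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        label, _, text = line.partition("\t")
        writer.writerow([LABELS[label.strip()], text.strip()])
    return out.getvalue()


sample = "1\tI love this movie.\n0\tThis was a terrible book.\n"
print(to_prediction_csv(sample), end="")
```

The result can be saved as sentiment_training.txt and uploaded in step 3.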

4. Start the training job:
Request
POST https://www.googleapis.com/prediction/v1.6/projects/dummy-c15ed/trainedmodels?key={YOUR_API_KEY}
{
   "id": "sentiment",
   "storageDataLocation": "dummy-c15ed.appspot.com/sentiment_training.txt"
}
 
Response
{
   "kind": "prediction#training",
   "id": "sentiment",
   "selfLink": "https://www.googleapis.com/prediction/v1.6/projects/dummy-c15ed/trainedmodels/sentiment",
   "storageDataLocation": "dummy-c15ed.appspot.com/sentiment_training.txt"
}
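The training request above can also be assembled in code. This is only a sketch: it builds the same POST with Python's standard `urllib.request` and stops short of sending it. The helper name `build_train_request` is hypothetical, and the API key is passed as a query parameter exactly as in the request shown above.

```python
import json
import urllib.request

BASE = "https://www.googleapis.com/prediction/v1.6/projects"


def build_train_request(project, model_id, data_location, api_key):
    """Build (but do not send) the POST that starts a training job."""
    url = f"{BASE}/{project}/trainedmodels?key={api_key}"
    body = json.dumps(
        {"id": model_id, "storageDataLocation": data_location}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_train_request(
    "dummy-c15ed",
    "sentiment",
    "dummy-c15ed.appspot.com/sentiment_training.txt",
    "YOUR_API_KEY",
)
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```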

5. Check the training status:
Request
GET https://www.googleapis.com/prediction/v1.6/projects/dummy-c15ed/trainedmodels/sentiment?key={YOUR_API_KEY}
 
Response
{
   "kind": "prediction#training",
   "id": "sentiment",
   "selfLink": "https://www.googleapis.com/prediction/v1.6/projects/dummy-c15ed/trainedmodels/sentiment",
   "created": "2016-11-15T07:15:34.690Z",
   "trainingStatus": "RUNNING"
}

Repeat until training is complete:
Request
GET https://www.googleapis.com/prediction/v1.6/projects/dummy-c15ed/trainedmodels/sentiment?key={YOUR_API_KEY}
 
Response
{
   "kind": "prediction#training",
   "id": "sentiment",
   "selfLink": "https://www.googleapis.com/prediction/v1.6/projects/dummy-c15ed/trainedmodels/sentiment",
   "created": "2016-11-15T07:15:34.690Z",
   "trainingComplete": "2016-11-15T07:16:26.026Z",
   "modelInfo": {
      "numberInstances": "7085",
      "modelType": "classification",
      "numberLabels": "2",
      "classificationAccuracy": "0.98"
   },
   "trainingStatus": "DONE"
}
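Checking the status by hand gets tedious, so here is a rough sketch of a polling loop. The function `wait_for_training` and its parameters are hypothetical. `fetch_status` stands for any callable that performs the GET above and returns the parsed JSON as a dict. Only "RUNNING" and "DONE" appear in the responses above, so treating every other value as a failure is an assumption on my part.

```python
import time


def wait_for_training(fetch_status, interval_seconds=10.0, max_attempts=60,
                      sleep=time.sleep):
    """Poll until trainingStatus is DONE and return the final response."""
    for _ in range(max_attempts):
        status = fetch_status()
        state = status.get("trainingStatus")
        if state == "DONE":
            return status
        if state != "RUNNING":
            # Assumption: anything other than RUNNING or DONE is a failure.
            raise RuntimeError(f"unexpected trainingStatus: {state!r}")
        sleep(interval_seconds)
    raise TimeoutError("training did not finish in time")


# Demo with canned responses in place of real HTTP calls.
canned = iter([
    {"id": "sentiment", "trainingStatus": "RUNNING"},
    {"id": "sentiment", "trainingStatus": "DONE",
     "modelInfo": {"classificationAccuracy": "0.98"}},
])
final = wait_for_training(lambda: next(canned), sleep=lambda s: None)
print(final["modelInfo"]["classificationAccuracy"])
```

Passing `sleep` as a parameter keeps the loop easy to try out without waiting.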

6. Submit new comments to test the model:
Request
POST https://www.googleapis.com/prediction/v1.6/projects/dummy-c15ed/trainedmodels/sentiment/predict?key={YOUR_API_KEY}
{
   "input": {
      "csvInstance": [
         "This is really a poor product, waste my time!"
      ]
   }
}
 
Response
{
   "kind": "prediction#output",
   "id": "sentiment",
   "selfLink": "https://www.googleapis.com/prediction/v1.6/projects/dummy-c15ed/trainedmodels/sentiment/predict",
   "outputLabel": "Negative",
   "outputMulti": [
      {
         "label": "Positive",
         "score": "0.353047"
      }, {
         "label": "Negative",
         "score": "0.646953"
      }
   ]
}

Request
POST https://www.googleapis.com/prediction/v1.6/projects/dummy-c15ed/trainedmodels/sentiment/predict?key={YOUR_API_KEY}
{
   "input": {
      "csvInstance": [
         "Pretty cool!  I love it!"
      ]
   }
}
 
Response
{
   "kind": "prediction#output",
   "id": "sentiment",
   "selfLink": "https://www.googleapis.com/prediction/v1.6/projects/dummy-c15ed/trainedmodels/sentiment/predict",
   "outputLabel": "Positive",
   "outputMulti": [
      {
         "label": "Positive",
         "score": "0.998411"
      }, {
         "label": "Negative",
         "score": "0.001589"
      }
   ]
}
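Note that the scores in `outputMulti` come back as strings. A small sketch of pulling the winning label and its score out of a response like the two above (the helper name `top_label` is my own, not part of the API):

```python
def top_label(response):
    """Return (label, score) for the highest-scoring entry in outputMulti."""
    # Scores arrive as strings, so convert them before comparing.
    scores = {item["label"]: float(item["score"])
              for item in response["outputMulti"]}
    label = max(scores, key=scores.get)
    return label, scores[label]


# The outputMulti values from the two responses above.
negative_response = {
    "outputLabel": "Negative",
    "outputMulti": [
        {"label": "Positive", "score": "0.353047"},
        {"label": "Negative", "score": "0.646953"},
    ],
}
positive_response = {
    "outputLabel": "Positive",
    "outputMulti": [
        {"label": "Positive", "score": "0.998411"},
        {"label": "Negative", "score": "0.001589"},
    ],
}

print(top_label(negative_response))  # ('Negative', 0.646953)
print(top_label(positive_response))  # ('Positive', 0.998411)
```

The score is useful as a rough confidence: the negative comment was classified far less decisively than the positive one.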

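Because the model was trained on English material, it also has to be tested in English. The results look good: both test comments were classified correctly. The next step is to find Chinese training material.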
