Classifier
套件會處理監督式分類,方法是透過在 Earth Engine 中執行的傳統 ML 演算法。這些分類器包括 CART、RandomForest、NaiveBayes 和 SVM。分類的一般工作流程如下:
- 收集訓練資料。組裝具有屬性的特徵,這些屬性會儲存已知的類別標籤,以及儲存預測因子的數值屬性。
- 建立分類器例項。視需要設定參數。
- 使用訓練資料訓練分類器。
- 分類圖片或特徵集合。
- 使用獨立驗證資料估算分類錯誤。
訓練資料是 FeatureCollection
,其中包含儲存類別標籤的屬性,以及儲存預測變數的屬性。類別標籤應為連續整數,從 0 開始。如有需要,請使用 remap()
將類別值轉換為連續整數。預測因子應為數值。
訓練和/或驗證資料可來自多種來源。如要在 Earth Engine 中以互動方式收集訓練資料,可以使用幾何圖形繪製工具 (請參閱程式碼編輯器頁面的幾何圖形工具部分)。或者,您也可以從 Earth Engine 資料表資產匯入預先定義的訓練資料 (詳情請參閱「匯入資料表資料」頁面)。從 ee.Classifier
的其中一個建構函式取得分類器。使用 classifier.train()
訓練分類器。使用 classify()
分類 Image
或 FeatureCollection
。以下範例使用分類與迴歸樹 (CART) 分類器 (Breiman 等人,1984 年) 預測三種簡單類別:
程式碼編輯器 (JavaScript)
// Define a function that scales and masks Landsat 8 surface reflectance images. function prepSrL8(image) { // Develop masks for unwanted pixels (fill, cloud, cloud shadow). var qaMask = image.select('QA_PIXEL').bitwiseAnd(parseInt('11111', 2)).eq(0); var saturationMask = image.select('QA_RADSAT').eq(0); // Apply the scaling factors to the appropriate bands. var getFactorImg = function(factorNames) { var factorList = image.toDictionary().select(factorNames).values(); return ee.Image.constant(factorList); }; var scaleImg = getFactorImg([ 'REFLECTANCE_MULT_BAND_.|TEMPERATURE_MULT_BAND_ST_B10']); var offsetImg = getFactorImg([ 'REFLECTANCE_ADD_BAND_.|TEMPERATURE_ADD_BAND_ST_B10']); var scaled = image.select('SR_B.|ST_B10').multiply(scaleImg).add(offsetImg); // Replace original bands with scaled bands and apply masks. return image.addBands(scaled, null, true) .updateMask(qaMask).updateMask(saturationMask); } // Make a cloud-free Landsat 8 surface reflectance composite. var image = ee.ImageCollection('LANDSAT/LC08/C02/T1_L2') .filterDate('2021-03-01', '2021-07-01') .map(prepSrL8) .median(); // Use these bands for prediction. var bands = ['SR_B2', 'SR_B3', 'SR_B4', 'SR_B5', 'SR_B6', 'SR_B7', 'ST_B10']; // Load training points. The numeric property 'class' stores known labels. var points = ee.FeatureCollection('GOOGLE/EE/DEMOS/demo_landcover_labels'); // This property stores the land cover labels as consecutive // integers starting from zero. var label = 'landcover'; // Overlay the points on the imagery to get training. var training = image.select(bands).sampleRegions({ collection: points, properties: [label], scale: 30 }); // Train a CART classifier with default parameters. var trained = ee.Classifier.smileCart().train(training, label, bands); // Classify the image with the same bands used for training. var classified = image.select(bands).classify(trained); // Display the inputs and the results. Map.setCenter(-122.0877, 37.7880, 11); Map.addLayer(image, {bands: ['SR_B4', 'SR_B3', 'SR_B2'], min: 0, max: 0.25}, 'image'); Map.addLayer(classified, {min: 0, max: 2, palette: ['orange', 'green', 'blue']}, 'classification');
import ee import geemap.core as geemap
Colab (Python)
# Define a function that scales and masks Landsat 8 surface reflectance images. def prep_sr_l8(image): """Scales and masks Landsat 8 surface reflectance images.""" # Develop masks for unwanted pixels (fill, cloud, cloud shadow). qa_mask = image.select('QA_PIXEL').bitwiseAnd(0b11111).eq(0) saturation_mask = image.select('QA_RADSAT').eq(0) # Apply the scaling factors to the appropriate bands. def _get_factor_img(factor_names): factor_list = image.toDictionary().select(factor_names).values() return ee.Image.constant(factor_list) scale_img = _get_factor_img([ 'REFLECTANCE_MULT_BAND_.|TEMPERATURE_MULT_BAND_ST_B10']) offset_img = _get_factor_img([ 'REFLECTANCE_ADD_BAND_.|TEMPERATURE_ADD_BAND_ST_B10']) scaled = image.select('SR_B.|ST_B10').multiply(scale_img).add(offset_img) # Replace original bands with scaled bands and apply masks. return image.addBands(scaled, None, True).updateMask( qa_mask).updateMask(saturation_mask) # Make a cloud-free Landsat 8 surface reflectance composite. l8_image = ( ee.ImageCollection('LANDSAT/LC08/C02/T1_L2') .filterDate('2021-03-01', '2021-07-01') .map(prep_sr_l8) .median()) # Use these bands for prediction. bands = ['SR_B2', 'SR_B3', 'SR_B4', 'SR_B5', 'SR_B6', 'SR_B7', 'ST_B10'] # Load training points. The numeric property 'class' stores known labels. points = ee.FeatureCollection('GOOGLE/EE/DEMOS/demo_landcover_labels') # This property stores the land cover labels as consecutive # integers starting from zero. label = 'landcover' # Overlay the points on the imagery to get training. training = l8_image.select(bands).sampleRegions( collection=points, properties=[label], scale=30 ) # Train a CART classifier with default parameters. trained = ee.Classifier.smileCart().train(training, label, bands) # Classify the image with the same bands used for training. classified = l8_image.select(bands).classify(trained) # Display the inputs and the results. m = geemap.Map() m.set_center(-122.0877, 37.7880, 11) m.add_layer( l8_image, {'bands': ['SR_B4', 'SR_B3', 'SR_B2'], 'min': 0, 'max': 0.25}, 'image', ) m.add_layer( classified, {'min': 0, 'max': 2, 'palette': ['orange', 'green', 'blue']}, 'classification', ) m
在本例中,資料表中的訓練點只會儲存類別標籤。請注意,訓練屬性 ('landcover'
) 會儲存從 0 開始的連續整數 (如有需要,請在表格中使用 remap()
,將類別標籤轉換為從零開始的連續整數)。另請注意,使用 image.sampleRegions()
將預測因子放入資料表,並建立訓練資料集。如要訓練分類器,請指定類別標籤屬性的名稱,以及訓練資料表中的屬性清單,分類器應將這些屬性用於預測因子。待分類圖片中的波段數量和順序,必須與提供給 classifier.train()
的屬性清單順序完全一致。使用 image.select()
確保分類器結構定義與圖片相符。
如果訓練資料是代表同質區域的多邊形,則每個多邊形中的每個像素都是訓練點。您可以如以下範例所示,使用多邊形進行訓練:
程式碼編輯器 (JavaScript)
// Define a function that scales and masks Landsat 8 surface reflectance images. function prepSrL8(image) { // Develop masks for unwanted pixels (fill, cloud, cloud shadow). var qaMask = image.select('QA_PIXEL').bitwiseAnd(parseInt('11111', 2)).eq(0); var saturationMask = image.select('QA_RADSAT').eq(0); // Apply the scaling factors to the appropriate bands. var getFactorImg = function(factorNames) { var factorList = image.toDictionary().select(factorNames).values(); return ee.Image.constant(factorList); }; var scaleImg = getFactorImg([ 'REFLECTANCE_MULT_BAND_.|TEMPERATURE_MULT_BAND_ST_B10']); var offsetImg = getFactorImg([ 'REFLECTANCE_ADD_BAND_.|TEMPERATURE_ADD_BAND_ST_B10']); var scaled = image.select('SR_B.|ST_B10').multiply(scaleImg).add(offsetImg); // Replace original bands with scaled bands and apply masks. return image.addBands(scaled, null, true) .updateMask(qaMask).updateMask(saturationMask); } // Make a cloud-free Landsat 8 surface reflectance composite. var image = ee.ImageCollection('LANDSAT/LC08/C02/T1_L2') .filterDate('2018-01-01', '2019-01-01') .map(prepSrL8) .median(); // Use these bands for prediction. var bands = ['SR_B2', 'SR_B3', 'SR_B4', 'SR_B5', 'SR_B6', 'SR_B7']; // Manually created polygons. var forest1 = ee.Geometry.Rectangle(-63.0187, -9.3958, -62.9793, -9.3443); var forest2 = ee.Geometry.Rectangle(-62.8145, -9.206, -62.7688, -9.1735); var nonForest1 = ee.Geometry.Rectangle(-62.8161, -9.5001, -62.7921, -9.4486); var nonForest2 = ee.Geometry.Rectangle(-62.6788, -9.044, -62.6459, -8.9986); // Make a FeatureCollection from the hand-made geometries. var polygons = ee.FeatureCollection([ ee.Feature(nonForest1, {'class': 0}), ee.Feature(nonForest2, {'class': 0}), ee.Feature(forest1, {'class': 1}), ee.Feature(forest2, {'class': 1}), ]); // Get the values for all pixels in each polygon in the training. var training = image.sampleRegions({ // Get the sample from the polygons FeatureCollection. collection: polygons, // Keep this list of properties from the polygons. properties: ['class'], // Set the scale to get Landsat pixels in the polygons. scale: 30 }); // Create an SVM classifier with custom parameters. var classifier = ee.Classifier.libsvm({ kernelType: 'RBF', gamma: 0.5, cost: 10 }); // Train the classifier. var trained = classifier.train(training, 'class', bands); // Classify the image. var classified = image.classify(trained); // Display the classification result and the input image. Map.setCenter(-62.836, -9.2399, 9); Map.addLayer(image, {bands: ['SR_B4', 'SR_B3', 'SR_B2'], min: 0, max: 0.25}, 'image'); Map.addLayer(polygons, {color: 'yellow'}, 'training polygons'); Map.addLayer(classified, {min: 0, max: 1, palette: ['orange', 'green']}, 'deforestation');
import ee import geemap.core as geemap
Colab (Python)
# Define a function that scales and masks Landsat 8 surface reflectance images. def prep_sr_l8(image): # Develop masks for unwanted pixels (fill, cloud, cloud shadow). qa_mask = image.select('QA_PIXEL').bitwiseAnd(0b11111).eq(0) saturation_mask = image.select('QA_RADSAT').eq(0) # Apply the scaling factors to the appropriate bands. def _get_factor_img(factor_names): factor_list = image.toDictionary().select(factor_names).values() return ee.Image.constant(factor_list) scale_img = _get_factor_img([ 'REFLECTANCE_MULT_BAND_.|TEMPERATURE_MULT_BAND_ST_B10']) offset_img = _get_factor_img([ 'REFLECTANCE_ADD_BAND_.|TEMPERATURE_ADD_BAND_ST_B10']) scaled = image.select('SR_B.|ST_B10').multiply(scale_img).add(offset_img) # Replace original bands with scaled bands and apply masks. return image.addBands(scaled, None, True).updateMask( qa_mask).updateMask(saturation_mask) # Make a cloud-free Landsat 8 surface reflectance composite. l8_image = ( ee.ImageCollection('LANDSAT/LC08/C02/T1_L2') .filterDate('2018-01-01', '2019-01-01') .map(prep_sr_l8) .median()) # Use these bands for prediction. bands = ['SR_B2', 'SR_B3', 'SR_B4', 'SR_B5', 'SR_B6', 'SR_B7'] # Manually created polygons. forest1 = ee.Geometry.Rectangle(-63.0187, -9.3958, -62.9793, -9.3443) forest2 = ee.Geometry.Rectangle(-62.8145, -9.206, -62.7688, -9.1735) non_forest1 = ee.Geometry.Rectangle(-62.8161, -9.5001, -62.7921, -9.4486) non_forest2 = ee.Geometry.Rectangle(-62.6788, -9.044, -62.6459, -8.9986) # Make a FeatureCollection from the hand-made geometries. polygons = ee.FeatureCollection([ ee.Feature(non_forest1, {'class': 0}), ee.Feature(non_forest1, {'class': 0}), ee.Feature(forest1, {'class': 1}), ee.Feature(forest2, {'class': 1}), ]) # Get the values for all pixels in each polygon in the training. training = l8_image.sampleRegions( # Get the sample from the polygons FeatureCollection. collection=polygons, # Keep this list of properties from the polygons. properties=['class'], # Set the scale to get Landsat pixels in the polygons. scale=30, ) # Create an SVM classifier with custom parameters. classifier = ee.Classifier.libsvm(kernelType='RBF', gamma=0.5, cost=10) # Train the classifier. trained = classifier.train(training, 'class', bands) # Classify the image. classified = l8_image.classify(trained) # Display the classification result and the input image. m = geemap.Map() m.set_center(-62.836, -9.2399, 9) m.add_layer( l8_image, {'bands': ['SR_B4', 'SR_B3', 'SR_B2'], 'min': 0, 'max': 0.25}, 'image', ) m.add_layer(polygons, {'color': 'yellow'}, 'training polygons') m.add_layer( classified, {'min': 0, 'max': 1, 'palette': ['orange', 'green']}, 'deforestation', ) m
這個範例使用支援向量機 (SVM) 分類器 (Burges 1998)。請注意,SVM 是以一組自訂參數指定。如果沒有預測問題的物理本質相關先驗資訊,就無法得知最佳參數。如需選擇 SVM 參數的粗略指南,請參閱 Hsu 等人 (2003)。
分類器輸出模式
ee.Classifier.setOutputMode()
方法可控制監督式分類結果的格式,讓輸出內容以幾種不同的方式呈現:
- 分類 (預設):輸出內容為類別編號。
- REGRESSION:輸出內容是標準迴歸的結果。
- 機率:輸出結果是分類正確的機率。
- MULTIPROBABILITY:輸出內容為各類別正確的機率陣列,並依所見類別排序。
- RAW:輸出內容是分類程序內部表示法的陣列。例如多決策樹模型中的原始票數。
- RAW_REGRESSION:輸出內容是回歸程序內部表示法的陣列。例如多個迴歸樹的原始預測值。
支援這些輸出模式的情況各有不同。下表摘要列出每個分類器支援的模式。
分類器 | 分類 | 迴歸 | PROBABILITY | MULTIPROBABILITY | RAW | RAW_REGRESSION |
---|---|---|---|---|---|---|
ee.Classifier.amnhMaxent | ||||||
ee.Classifier.minimumDistance | ||||||
ee.Classifier.smileCart | ||||||
ee.Classifier.smileGradientTreeBoost | ||||||
ee.Classifier.smileKNN | ||||||
ee.Classifier.smileNaiveBayes | ||||||
ee.Classifier.smileRandomForest | ||||||
ee.Classifier.libsvm C_SVC | ||||||
ee.Classifier.libsvm NU_SVC | ||||||
ee.Classifier.libsvm ONE_CLASS | ||||||
ee.Classifier.libsvm EPSILON_SVR | ||||||
ee.Classifier.libsvm NU_SVR |
訓練分類器前,請先使用 setOutputMode()
定義輸出格式。
舉例來說,您可以將前一個程式碼區塊中的 SVM 分類器設定為輸出機率,而非預設分類標籤:
程式碼編輯器 (JavaScript)
var classifier = ee.Classifier.libsvm({ kernelType: 'RBF', gamma: 0.5, cost: 10 }).setOutputMode('PROBABILITY'); var trained = classifier.train(training, 'class', bands);
import ee import geemap.core as geemap
Colab (Python)
classifier = ee.Classifier.libsvm( kernelType='RBF', gamma=0.5, cost=10 ).setOutputMode('PROBABILITY') trained = classifier.train(training, 'class', bands)
準確度評估
如要評估分類器的準確度,請使用 ConfusionMatrix
(Stehman 1997)。以下範例使用 sample()
從 MODIS 參考圖片產生訓練和驗證資料,並比較代表訓練和驗證準確度的混淆矩陣:
程式碼編輯器 (JavaScript)
// Define a region of interest. var roi = ee.Geometry.BBox(-122.93, 36.99, -121.20, 38.16); // Define a function that scales and masks Landsat 8 surface reflectance images. function prepSrL8(image) { // Develop masks for unwanted pixels (fill, cloud, cloud shadow). var qaMask = image.select('QA_PIXEL').bitwiseAnd(parseInt('11111', 2)).eq(0); var saturationMask = image.select('QA_RADSAT').eq(0); // Apply the scaling factors to the appropriate bands. var getFactorImg = function(factorNames) { var factorList = image.toDictionary().select(factorNames).values(); return ee.Image.constant(factorList); }; var scaleImg = getFactorImg([ 'REFLECTANCE_MULT_BAND_.|TEMPERATURE_MULT_BAND_ST_B10']); var offsetImg = getFactorImg([ 'REFLECTANCE_ADD_BAND_.|TEMPERATURE_ADD_BAND_ST_B10']); var scaled = image.select('SR_B.|ST_B10').multiply(scaleImg).add(offsetImg); // Replace original bands with scaled bands and apply masks. return image.addBands(scaled, null, true) .updateMask(qaMask).updateMask(saturationMask); } // Make a cloud-free Landsat 8 surface reflectance composite. var input = ee.ImageCollection('LANDSAT/LC08/C02/T1_L2') .filterBounds(roi) .filterDate('2020-03-01', '2020-07-01') .map(prepSrL8) .median() .setDefaultProjection('EPSG:4326', null, 30) .select(['SR_B2', 'SR_B3', 'SR_B4', 'SR_B5', 'SR_B6', 'SR_B7']); // Use MODIS land cover, IGBP classification, for training. var modis = ee.Image('MODIS/006/MCD12Q1/2020_01_01') .select('LC_Type1'); // Sample the input imagery to get a FeatureCollection of training data. var training = input.addBands(modis).sample({ region: roi, numPixels: 5000, seed: 0 }); // Make a Random Forest classifier and train it. var classifier = ee.Classifier.smileRandomForest(10) .train({ features: training, classProperty: 'LC_Type1', inputProperties: ['SR_B2', 'SR_B3', 'SR_B4', 'SR_B5', 'SR_B6', 'SR_B7'] }); // Classify the input imagery. var classified = input.classify(classifier); // Get a confusion matrix representing resubstitution accuracy. var trainAccuracy = classifier.confusionMatrix(); print('Resubstitution error matrix: ', trainAccuracy); print('Training overall accuracy: ', trainAccuracy.accuracy()); // Sample the input with a different random seed to get validation data. var validation = input.addBands(modis).sample({ region: roi, numPixels: 5000, seed: 1 // Filter the result to get rid of any null pixels. }).filter(ee.Filter.notNull(input.bandNames())); // Classify the validation data. var validated = validation.classify(classifier); // Get a confusion matrix representing expected accuracy. var testAccuracy = validated.errorMatrix('LC_Type1', 'classification'); print('Validation error matrix: ', testAccuracy); print('Validation overall accuracy: ', testAccuracy.accuracy()); // Define a palette for the IGBP classification. var igbpPalette = [ 'aec3d4', // water '152106', '225129', '369b47', '30eb5b', '387242', // forest '6a2325', 'c3aa69', 'b76031', 'd9903d', '91af40', // shrub, grass '111149', // wetlands 'cdb33b', // croplands 'cc0013', // urban '33280d', // crop mosaic 'd7cdcc', // snow and ice 'f7e084', // barren '6f6f6f' // tundra ]; // Display the input and the classification. Map.centerObject(roi, 10); Map.addLayer(input.clip(roi), {bands: ['SR_B4', 'SR_B3', 'SR_B2'], min: 0, max: 0.25}, 'landsat'); Map.addLayer(classified.clip(roi), {palette: igbpPalette, min: 0, max: 17}, 'classification');
import ee import geemap.core as geemap
Colab (Python)
# Define a region of interest. roi = ee.Geometry.BBox(-122.93, 36.99, -121.20, 38.16) # Define a function that scales and masks Landsat 8 surface reflectance images. def prep_sr_l8(image): """Scales and masks Landsat 8 surface reflectance images.""" # Develop masks for unwanted pixels (fill, cloud, cloud shadow). qa_mask = image.select('QA_PIXEL').bitwiseAnd(0b1111).eq(0) saturation_mask = image.select('QA_RADSAT').eq(0) # Apply the scaling factors to the appropriate bands. def _get_factor_img(factor_names): factor_list = image.toDictionary().select(factor_names).values() return ee.Image.constant(factor_list) scale_img = _get_factor_img([ 'REFLECTANCE_MULT_BAND_.|TEMPERATURE_MULT_BAND_ST_B10']) offset_img = _get_factor_img([ 'REFLECTANCE_ADD_BAND_.|TEMPERATURE_ADD_BAND_ST_B10']) scaled = image.select('SR_B.|ST_B10').multiply(scale_img).add(offset_img) # Replace original bands with scaled bands and apply masks. return image.addBands(scaled, None, True).updateMask( qa_mask).updateMask(saturation_mask) # Make a cloud-free Landsat 8 surface reflectance composite. input_image = ( ee.ImageCollection('LANDSAT/LC08/C02/T1_L2') .filterBounds(roi) .filterDate('2020-03-01', '2020-07-01') .map(prep_sr_l8) .median() .setDefaultProjection('EPSG:4326', None, 30) .select(['SR_B2', 'SR_B3', 'SR_B4', 'SR_B5', 'SR_B6', 'SR_B7']) ) # Use MODIS land cover, IGBP classification, for training. modis = ee.Image('MODIS/006/MCD12Q1/2020_01_01').select('LC_Type1') # Sample the input imagery to get a FeatureCollection of training data. training = input_image.addBands(modis).sample( region=roi, numPixels=5000, seed=0 ) # Make a Random Forest classifier and train it. classifier = ee.Classifier.smileRandomForest(10).train( features=training, classProperty='LC_Type1', inputProperties=['SR_B2', 'SR_B3', 'SR_B4', 'SR_B5', 'SR_B6', 'SR_B7'], ) # Classify the input imagery. classified = input_image.classify(classifier) # Get a confusion matrix representing resubstitution accuracy. train_accuracy = classifier.confusionMatrix() display('Resubstitution error matrix:', train_accuracy) display('Training overall accuracy:', train_accuracy.accuracy()) # Sample the input with a different random seed to get validation data. validation = ( input_image.addBands(modis) .sample( region=roi, numPixels=5000, seed=1, # Filter the result to get rid of any null pixels. ) .filter(ee.Filter.notNull(input_image.bandNames())) ) # Classify the validation data. validated = validation.classify(classifier) # Get a confusion matrix representing expected accuracy. test_accuracy = validated.errorMatrix('LC_Type1', 'classification') display('Validation error matrix:', test_accuracy) display('Validation overall accuracy:', test_accuracy.accuracy()) # Define a palette for the IGBP classification. igbp_palette = [ 'aec3d4', # water '152106', '225129', '369b47', '30eb5b', '387242', # forest '6a2325', 'c3aa69', 'b76031', 'd9903d', '91af40', # shrub, grass '111149', # wetlands 'cdb33b', # croplands 'cc0013', # urban '33280d', # crop mosaic 'd7cdcc', # snow and ice 'f7e084', # barren '6f6f6f' # tundra ] # Display the input and the classification with geemap in a notebook. m = geemap.Map() m.center_object(roi, 10) m.add_layer( input_image.clip(roi), {'bands': ['SR_B4', 'SR_B3', 'SR_B2'], 'min': 0, 'max': 0.25}, 'landsat', ) m.add_layer( classified.clip(roi), {'palette': igbp_palette, 'min': 0, 'max': 17}, 'classification', ) m
這個範例使用隨機森林 (Breiman 2001) 分類器和 10 棵樹,將 MODIS 資料縮放至 Landsat 解析度。sample()
方法會從 MODIS 資料產生兩個隨機樣本:一個用於訓練,另一個用於驗證。訓練樣本用於訓練分類器。
您可以從 classifier.confusionMatrix()
取得訓練資料的重新代入準確率。如要取得驗證準確度,請分類驗證資料。這會在驗證 FeatureCollection
中新增 classification
屬性。在分類的 errorMatrix()
上呼叫 FeatureCollection
,取得代表驗證 (預期) 準確度的混淆矩陣。
檢查輸出內容,確認從訓練資料估算的整體準確率遠高於驗證資料。從訓練資料估算的準確度會高估,因為隨機森林是「配適」訓練資料。如驗證資料的估計結果所示,模型分類未知資料的準確率較低。
您也可以擷取單一樣本,並使用特徵集合的 randomColumn()
方法進行分割。接續上一個範例:
程式碼編輯器 (JavaScript)
var sample = input.addBands(modis).sample({ region: roi, numPixels: 5000, seed: 0 }); // The randomColumn() method will add a column of uniform random // numbers in a column named 'random' by default. sample = sample.randomColumn(); var split = 0.7; // Roughly 70% training, 30% testing. var training = sample.filter(ee.Filter.lt('random', split)); var validation = sample.filter(ee.Filter.gte('random', split));
import ee import geemap.core as geemap
Colab (Python)
sample = input_image.addBands(modis).sample(region=roi, numPixels=5000, seed=0) # The randomColumn() method will add a column of uniform random # numbers in a column named 'random' by default. sample = sample.randomColumn() split = 0.7 # Roughly 70% training, 30% testing. training = sample.filter(ee.Filter.lt('random', split)) validation = sample.filter(ee.Filter.gte('random', split))
您可能也想確保訓練樣本與評估樣本不相關。這可能是因為要預測的現象具有空間自相關性。 如要排除可能以這種方式相互關聯的樣本,其中一種方法是移除與任何其他樣本距離在某個範圍內的樣本。這項操作可透過空間聯結完成:
程式碼編輯器 (JavaScript)
// Sample the input imagery to get a FeatureCollection of training data. var sample = input.addBands(modis).sample({ region: roi, numPixels: 5000, seed: 0, geometries: true, tileScale: 16 }); // The randomColumn() method will add a column of uniform random // numbers in a column named 'random' by default. sample = sample.randomColumn(); var split = 0.7; // Roughly 70% training, 30% testing. var training = sample.filter(ee.Filter.lt('random', split)); print('Training size:', training.size()); var validation = sample.filter(ee.Filter.gte('random', split)); // Spatial join. var distFilter = ee.Filter.withinDistance({ distance: 1000, leftField: '.geo', rightField: '.geo', maxError: 10 }); var join = ee.Join.inverted(); // Apply the join. training = join.apply(training, validation, distFilter); print('Training size after spatial filtering:', training.size());
import ee import geemap.core as geemap
Colab (Python)
# Sample the input imagery to get a FeatureCollection of training data. sample = input_image.addBands(modis).sample( region=roi, numPixels=5000, seed=0, geometries=True, tileScale=16 ) # The randomColumn() method will add a column of uniform random # numbers in a column named 'random' by default. sample = sample.randomColumn() split = 0.7 # Roughly 70% training, 30% testing. training = sample.filter(ee.Filter.lt('random', split)) display('Training size:', training.size()) validation = sample.filter(ee.Filter.gte('random', split)) # Spatial join. dist_filter = ee.Filter.withinDistance( distance=1000, leftField='.geo', rightField='.geo', maxError=10 ) join = ee.Join.inverted() # Apply the join. training = join.apply(training, validation, dist_filter) display('Training size after spatial filtering:', training.size())
請注意,在前一個程式碼片段中,geometries
會在 sample()
中設為 true
。這是為了保留空間聯結所需的樣本點空間資訊。另請注意,tileScale
已設為 16
。
這是為了避免發生「超過使用者記憶體限制」錯誤。
正在儲存分類器
如果輸入資料過大 (超過 99 MB) 或訓練時間過長 (5 分鐘),可能無法以互動方式訓練分類器。使用 Export.classifier.toAsset
以批次工作形式執行分類器訓練,這樣就能在更多記憶體支援下長時間執行。您可以匯出並重新載入訓練成本高昂的分類器,避免重新訓練。
程式碼編輯器 (JavaScript)
// Using the random forest classifier defined earlier, export the random // forest classifier as an Earth Engine asset. var classifierAssetId = 'projects/<PROJECT-ID>/assets/upscaled_MCD12Q1_random_forest'; Export.classifier.toAsset( classifier, 'Saved-random-forest-IGBP-classification', classifierAssetId );
import ee import geemap.core as geemap
Colab (Python)
# Using the random forest classifier defined earlier, export the random # forest classifier as an Earth Engine asset. classifier_asset_id = ( 'projects/<PROJECT-ID>/assets/upscaled_MCD12Q1_random_forest' ) task = ee.batch.Export.classifier.toAsset( classifier, 'Saved-random-forest-IGBP-classification', classifier_asset_id ) task.start()
如要載入已儲存的分類器,請使用 ee.Classifier.load()
演算法,指定匯出的分類器 ID,並像使用任何其他訓練過的分類器一樣使用。
程式碼編輯器 (JavaScript)
// Once the classifier export finishes, we can load our saved classifier. var savedClassifier = ee.Classifier.load(classifierAssetId); // We can perform classification just as before with the saved classifier now. var classified = input.classify(savedClassifier); Map.addLayer(classified.clip(roi), {palette: igbpPalette, min: 0, max: 17}, 'classification');
import ee import geemap.core as geemap
Colab (Python)
# Once the classifier export finishes, we can load our saved classifier. saved_classifier = ee.Classifier.load(classifier_asset_id) # We can perform classification just as before with the saved classifier now. classified = input_image.classify(saved_classifier) m = geemap.Map() m.center_object(roi, 10) m.add_layer( classified.clip(roi), {'palette': igbp_palette, 'min': 0, 'max': 17}, 'classification', ) m