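I need to sync products with an ERP that delivers our product data every night, with no deltas. So I wrote an import script, which actually works well, but it takes ages. There are about 2000 products, so it shouldn't take that long, yet it needs about 7 hours! I do delete the product property associations and also delete missing products, but 7 hours is still far too long. The server is a scalable, elastic server with up to 8 GB of memory.
Here is the code (simplified):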
/**
 * Number of entities to process per batch.
 */
public const BATCH = 200;

public function execute()
{
    // read products from import xml file
    $importProducts = $this->loadProducts();
    $csvBatch = array_chunk($importProducts, self::BATCH);
    $productNumbers = [];
    foreach ($csvBatch as $products) {
        $productNumbers[] = $this->processImportProducts($products);
    }
    $this->deleteProducts(array_merge(...$productNumbers));

    return 0;
}

private function processImportProducts($productsData)
{
    $products = [];
    $productNumbers = [];
    foreach ($productsData as $product) {
        $products[$product['ProductNo']] = $this->importProducts($product);
        $productNumbers[] = $product['ProductNo'];
    }

    // upsert product
    try {
        $this->cleanProductProperties($products, $this->context);
        $this->productRepository->upsert(array_values($products), $this->context);
    } catch (WriteException $exception) {
        $this->logger->info(' ');
        $this->logger->info('<error>Products could not be imported. Message: '. $exception->getMessage() .'</error>');
    }
    unset($products);

    return $productNumbers;
}

private function cleanProductProperties($products, $context)
{
    $productIds = array_values(array_map(static function ($product) {
        return $product['id'];
    }, $products));

    $productProperties = $this->productPropertyRepository->searchIds(
        (new Criteria())->addFilter(new EqualsAnyFilter('productId', $productIds)),
        $context
    );

    $productRelationsToDelete = [];
    foreach ($productProperties->getIds() as $productProperty) {
        $productRelationsToDelete[] = [
            'productId' => $productProperty['product_id'],
            'optionId' => $productProperty['property_group_option_id']
        ];
    }

    $this->productPropertyRepository->delete($productRelationsToDelete, $context);
    unset($productIds, $productProperties, $productRelationsToDelete);
}

private function importProducts($product)
{
    $productNumber = $product['ProductNo'];

    // search product by productNumber
    $productSearch = $this->productRepository->search(
        (new Criteria())->addFilter(new EqualsFilter('productNumber', $productNumber)),
        $this->context
    );
    $existingProduct = $productSearch->getEntities()->first();
    if ($existingProduct) {
        $productId = $existingProduct->getId();
    } else {
        $productId = Uuid::randomHex();
    }

    $productData = [
        'id' => $productId,
        'productNumber' => $productNumber,
        'price' => [
            [
                'currencyId' => Defaults::CURRENCY,
                'gross' => 0,
                'net' => 0,
                'linked' => true
            ]
        ],
        'stock' => 99999,
        'taxId' => $this->taxId,
        'name' => $productNames,
        'description' => $productDescriptions
    ];

    return $productData;
}

Elasticsearch is also running, which of course has an impact as well. Does anyone know how to improve this? For example, is there a way to disable Elasticsearch indexing on a repository upsert? And is there a better way to reduce memory usage?
Thanks for any suggestions!
Posted on 2022-08-08 07:56:50
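Are you running this in the dev or the prod environment? It is highly recommended to run mass data operations like this in the prod environment.
You can speed up the process considerably by indexing the data asynchronously. This fills up the message queue. Entities are usually persisted much faster this way, but some essential data may still be missing until the messages in the queue have been processed. You can have the data indexed via the queue by setting a state on the context: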
$context->addState(EntityIndexerRegistry::USE_INDEXING_QUEUE);
You can also disable indexing completely, but this is not recommended:
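$context->addState(EntityIndexerRegistry::DISABLE_INDEXING);
Another option is to skip specific indexers. If you know for certain that specific data does not need to be indexed, you can skip whole entity indexers or specific updaters: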
$context->addExtension(EntityIndexerRegistry::EXTENSION_INDEXER_SKIP, new ArrayEntity(['category.indexer']));
https://stackoverflow.com/questions/73274264
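To show where the queue state from the answer could fit into the import from the question, here is a minimal sketch. It assumes a Shopware 6 installation; `createImportContext()` is a hypothetical helper name, not part of Shopware, and the import is assumed to pass the returned context to every repository call. It is only one way to wire this up, not the definitive one.

```php
<?php declare(strict_types=1);

use Shopware\Core\Framework\Context;
use Shopware\Core\Framework\DataAbstractionLayer\Indexing\EntityIndexerRegistry;

/**
 * Hypothetical helper (not part of Shopware): builds the context that the
 * import passes to its repository calls.
 */
function createImportContext(): Context
{
    $context = Context::createDefaultContext();

    // Defer indexing to the message queue instead of running it
    // synchronously as part of each upsert.
    $context->addState(EntityIndexerRegistry::USE_INDEXING_QUEUE);

    // Alternative named in the answer, but not recommended there:
    // $context->addState(EntityIndexerRegistry::DISABLE_INDEXING);

    return $context;
}
```

In the question's code this context would take the place of `$this->context`, for example in `$this->productRepository->upsert(array_values($products), $context)`. The queued indexing messages still have to be processed by a worker before the indexed data is complete, so the time saved in the import is partly moved to the queue rather than removed.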