RError.com

Vladimir
Asked: 2021-11-13 00:27:37 +0000 UTC

Problem getting data from a block


There is a ParseItem function that parses products, for example Product 1 and Product 2.

Here, as an example, Product 2 is parsed.

        private async static Task ParseItem(string itemUrl)
        {
            HttpClient client = new HttpClient();
            client.DefaultRequestHeaders.Add("User-Agent", "C# App");
            HttpResponseMessage response = await client.GetAsync(itemUrl);
            string source = default;
            if (response != null && response.StatusCode == HttpStatusCode.OK)
            {
                source = await response.Content.ReadAsStringAsync();
            }

            HtmlParser domParser = new HtmlParser();
            IHtmlDocument document = await domParser.ParseDocumentAsync(source);

            string name = document.QuerySelector("[data-qaid|=product_name]")?.InnerHtml;
            string price = document.QuerySelector("[data-qaid|=product_price]")?.Text();
            string code = document.QuerySelector("[data-qaid|=product-sku]")?.Text();
            string characteristics = document.QuerySelector("[data-qaid|=attributes]")?.Text(); // Characteristics
            string company_name = document.QuerySelector("[data-qaid|=company_name]")?.InnerHtml;

            // I cannot get this data: the variables below are always null
            string info_by_company = document.QuerySelector("[data-qaid|=info_by_company]")?.InnerHtml;
            var company_location = document.QuerySelector("[data-qaid|=company_location]");
            var phone = document.QuerySelector("[data-qaid|=phone]");
            var site = document.QuerySelector("[data-qaid|=site]");
            var schedule_block = document.QuerySelector("[data-qaid|=schedule_block]");
        }

        static async Task Main()
        {

            await ParseItem("https://prom.ua/p1203034278-nabor-dlya-uhoda.html");
            Console.ReadKey();
        }

Problem: whichever product I try to parse, the result is always the same: I cannot get the data from the block shown in Figure 1.

c#

1 Answer

  1. Best Answer
    Vladimir
    2021-11-13T01:56:37Z

    json.linq

    using Newtonsoft.Json.Linq;
    using System;
    using System.Linq;
    using System.Net.Http;
    using System.Text;
    using System.Threading.Tasks;
    
    static class Ext
    {
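        public static string SubstringJson(this string value, string start, string end)
        {
            // Locate the start marker and skip past it.
            var startIndex = value.IndexOf(start, StringComparison.Ordinal);
            if (startIndex < 0) throw new FormatException($"Marker '{start}' not found.");
            startIndex += start.Length;
    
            // Search for the end marker only after the start marker.
            var endIndex = value.IndexOf(end, startIndex, StringComparison.Ordinal);
            if (endIndex < 0) throw new FormatException($"Marker '{end}' not found.");
    
            // Drop surrounding whitespace and the trailing ';' of the JavaScript assignment.
            return value.Substring(startIndex, endIndex - startIndex).Trim().TrimEnd(';');
        }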
    
        public static int SubstringId(this string url)
        {
            var result = new Uri(url).AbsolutePath.Split('-').FirstOrDefault()?.Remove(0, 2);
            return int.Parse(result);
        }
    }
    
    class Program
    {
        static readonly HttpClient client = new HttpClient();
        static async Task Main(string[] args)
        {
    
            var url = "https://prom.ua/p1180330579-kabel-multimedijnyj-minidisplay.html";
            var id = url.SubstringId();
    
            var json = await GetJsonAsync(url);
            var product = json[$"Product:{id}"];
    
            var name = product["name"];
            var price = product["price"];
            var priceLocalized = product["priceCurrencyLocalized"];
            var priceUSD = product["priceUSD"];
    
            var companyId = product["company"]["id"];
            var company = json[$"{companyId}"];
            var companyName = company["name"];
            var companyCity = company["city"];
            var companyPerson = company["contactPerson"];
            var companyPersonPhone = company["phone"];
    
    
            var builder = new StringBuilder();
            builder.AppendLine($"{name}");
            builder.AppendLine($"Price: {price} {priceLocalized} ({priceUSD} USD)");
            builder.AppendLine($"Company: {companyName} (city: {companyCity})");
            builder.AppendLine($"Contact: {companyPerson} ({companyPersonPhone})");
                    
            Console.WriteLine(builder);
        }
    
        private static async Task<JObject> GetJsonAsync(string url)
        {
            var html = await client.GetStringAsync(url);
            var jsonString = html.SubstringJson("window.ApolloCacheState =", "window.SPAConfig");
            return JObject.Parse(jsonString);
        }    
    }
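
The code above cuts the JSON out of the page by looking for two markers in the HTML. If the site changes its markup, either marker may be missing. Below is a minimal sketch of a non-throwing variant, assuming the page still assigns the cache to `window.ApolloCacheState` as in the answer. The names `HtmlJsonExtractor` and `TryExtractBetween` are hypothetical and not part of the original answer.

```csharp
using System;

static class HtmlJsonExtractor
{
    // Hypothetical helper: returns the text between `start` and the first `end`
    // that follows it, without the trailing ';' of the JavaScript assignment.
    public static bool TryExtractBetween(string html, string start, string end, out string json)
    {
        json = null;
        if (string.IsNullOrEmpty(html)) return false;

        int startIndex = html.IndexOf(start, StringComparison.Ordinal);
        if (startIndex < 0) return false;      // start marker is missing
        startIndex += start.Length;

        // Look for the end marker only after the start marker.
        int endIndex = html.IndexOf(end, startIndex, StringComparison.Ordinal);
        if (endIndex < 0) return false;        // end marker is missing

        json = html.Substring(startIndex, endIndex - startIndex).Trim().TrimEnd(';').TrimEnd();
        return json.Length > 0;
    }
}
```

When the call returns true, the extracted string can be passed to `JObject.Parse` in the same way as in `GetJsonAsync`. When it returns false, the caller can log the URL and skip the product instead of failing with an exception.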
    

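    Deserializing into objects

    Product

        // Namespaces needed by the classes and converters in this part.
        using System;
        using System.Globalization;
        using Newtonsoft.Json;
        using Newtonsoft.Json.Converters;

        public class Product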
        {
            [JsonProperty("id")]
            public long Id { get; set; }
    
            [JsonProperty("name")]
            public string Name { get; set; }
    
            [JsonProperty("price")]
            [JsonConverter(typeof(ParseStringConverter))]
            public long Price { get; set; }
    
            [JsonProperty("priceUSD")]
            public string PriceUsd { get; set; }
    
            [JsonProperty("priceCurrency")]
            public string PriceCurrency { get; set; }
    
            [JsonProperty("priceCurrencyLocalized")]
            public string PriceCurrencyLocalized { get; set; }
    
            [JsonProperty("discountedPrice")]
            [JsonConverter(typeof(ParseStringConverter))]
            public long DiscountedPrice { get; set; }
            /// <summary>
            /// Whether the product has a discount.
            /// </summary>
            [JsonProperty("hasDiscount")]
            public bool HasDiscount { get; set; }
    
            [JsonProperty("priceOriginal")]
            [JsonConverter(typeof(ParseStringConverter))]
            public long PriceOriginal { get; set; }
    
            [JsonProperty("sku")]
            public string Sku { get; set; }
    
    
            [JsonProperty("images({\"height\":640,\"width\":640})")]
            public Images Images { get; set; }
    
            
            [JsonProperty("categoryId")]
            public long CategoryId { get; set; }
    
            [JsonProperty("urlForCanonical")]
            public Uri Catalog { get; set; }
    
            [JsonProperty("keywords")]
            public string Keywords { get; set; }
    
            [JsonProperty("urlForProductCatalog")]
            public Uri ProductCatalog { get; set; }
    
            [JsonProperty("descriptionFull")]
            public string DescriptionFull { get; set; }
    
            [JsonProperty("descriptionPlain")]
            public string DescriptionPlain { get; set; }
    
            [JsonProperty("company_id")]
            public long CompanyId { get; set; }
    
            
            [JsonProperty("groupId")]
            public long GroupId { get; set; }
        }
    
       
        public class Images
        {
            [JsonProperty("type")]
            public string Type { get; set; }
    
            [JsonProperty("json")]
            public Uri[] Json { get; set; }
        }
    

    Converters

        public enum TypeEnum { Id };
    
        internal static class Converter
        {
            public static readonly JsonSerializerSettings Settings = new JsonSerializerSettings
            {
                MetadataPropertyHandling = MetadataPropertyHandling.Ignore,
                DateParseHandling = DateParseHandling.None,
                Converters =
                {
                    TypeEnumConverter.Singleton,
                    new IsoDateTimeConverter { DateTimeStyles = DateTimeStyles.AssumeUniversal }
                },
            };
        }
    
        internal class TypeEnumConverter : JsonConverter
        {
            public override bool CanConvert(Type t) => t == typeof(TypeEnum) || t == typeof(TypeEnum?);
    
            public override object ReadJson(JsonReader reader, Type t, object existingValue, JsonSerializer serializer)
            {
                if (reader.TokenType == JsonToken.Null) return null;
                var value = serializer.Deserialize<string>(reader);
                if (value == "id")
                {
                    return TypeEnum.Id;
                }
                throw new Exception("Cannot unmarshal type TypeEnum");
            }
    
            public override void WriteJson(JsonWriter writer, object untypedValue, JsonSerializer serializer)
            {
                if (untypedValue == null)
                {
                    serializer.Serialize(writer, null);
                    return;
                }
                var value = (TypeEnum)untypedValue;
                if (value == TypeEnum.Id)
                {
                    serializer.Serialize(writer, "id");
                    return;
                }
                throw new Exception("Cannot marshal type TypeEnum");
            }
    
            public static readonly TypeEnumConverter Singleton = new TypeEnumConverter();
        }
    
        internal class ParseStringConverter : JsonConverter
        {
            public override bool CanConvert(Type t) => t == typeof(long) || t == typeof(long?);
    
            public override object ReadJson(JsonReader reader, Type t, object existingValue, JsonSerializer serializer)
            {
                if (reader.TokenType == JsonToken.Null) return null;
                var value = serializer.Deserialize<string>(reader);
                long l;
                if (Int64.TryParse(value, out l))
                {
                    return l;
                }
                throw new Exception("Cannot unmarshal type long");
            }
    
            public override void WriteJson(JsonWriter writer, object untypedValue, JsonSerializer serializer)
            {
                if (untypedValue == null)
                {
                    serializer.Serialize(writer, null);
                    return;
                }
                var value = (long)untypedValue;
                serializer.Serialize(writer, value.ToString());
                return;
            }
    
            public static readonly ParseStringConverter Singleton = new ParseStringConverter();
        }
    
        internal class JsonUnionConverter : JsonConverter
        {
            public override bool CanConvert(Type t) => t == typeof(JsonUnion) || t == typeof(JsonUnion?);
    
            public override object ReadJson(JsonReader reader, Type t, object existingValue, JsonSerializer serializer)
            {
                switch (reader.TokenType)
                {
                    case JsonToken.String:
                    case JsonToken.Date:
                        var stringValue = serializer.Deserialize<string>(reader);
                        return new JsonUnion { String = stringValue };
                    case JsonToken.StartObject:
                        var objectValue = serializer.Deserialize<JsonJson>(reader);
                        return new JsonUnion { JsonJson = objectValue };
                }
                throw new Exception("Cannot unmarshal type JsonUnion");
            }
    
            public override void WriteJson(JsonWriter writer, object untypedValue, JsonSerializer serializer)
            {
                var value = (JsonUnion)untypedValue;
                if (value.String != null)
                {
                    serializer.Serialize(writer, value.String);
                    return;
                }
                if (value.JsonJson != null)
                {
                    serializer.Serialize(writer, value.JsonJson);
                    return;
                }
                throw new Exception("Cannot marshal type JsonUnion");
            }
    
            public static readonly JsonUnionConverter Singleton = new JsonUnionConverter();
        }
    

    Company

        public  class Company
        {
            [JsonProperty("id")]
            public long Id { get; set; }
    
            [JsonProperty("name")]
            public string Name { get; set; }
    
            [JsonProperty("city")]
            public string City { get; set; }
    
            [JsonProperty("ageYears")]
            public long AgeYears { get; set; }
    
            [JsonProperty("opinionPositivePercent")]
            public long OpinionPositivePercent { get; set; }
    
            [JsonProperty("opinionTotal")]
            public long OpinionTotal { get; set; }
    
            [JsonProperty("opinionTotalInRating")]
            public long OpinionTotalInRating { get; set; }
    
            /// <summary>
            /// All of the seller's products.
            /// </summary>
            [JsonProperty("urlForCompanyProducts")]
            public Uri CompanyProducts { get; set; }
            /// <summary>
            /// ?
            /// </summary>
            [JsonProperty("portalPageURL")]
            public Uri PortalPage { get; set; }
            /// <summary>
            /// Company representative.
            /// </summary>
            [JsonProperty("contactPerson")]
            public string ContactPerson { get; set; }
           
            [JsonProperty("contactEmail")]
            public object ContactEmail { get; set; }
    
            [JsonProperty("mainLogoUrl({\"height\":50,\"width\":100})")]
            public Uri Logo { get; set; }
    
            [JsonProperty("webSiteUrl")]
            public Uri WebSite { get; set; }
    
            /// <summary>
            /// Reviews.
            /// </summary>
            [JsonProperty("companyOpinionsUrl")]
            public Uri CompanyOpinions { get; set; }
    
            /// <summary>
            /// Delivery options.
            /// </summary>
            [JsonProperty("deliveryOptions({\"itemsPrices\":[],\"productIds\":[]})")]
            public DeliveryOptions DeliveryOptions { get; set; }
    
    
            [JsonProperty("phones")]
            public Phones Phones { get; set; }
    
      
            [JsonProperty("regionName")]
            public string RegionName { get; set; }
    
          
            [JsonProperty("phone")]
            public string Phone { get; set; }
        }
    
        public  class DeliveryOptions
        {
            [JsonProperty("type")]
            public string Type { get; set; }
    
            [JsonProperty("json")]
            public DeliveryOptionsItemsPricesProductIdsJson[] Json { get; set; }
        }
    
        public  class DeliveryOptionsItemsPricesProductIdsJson
        {
            [JsonProperty("type")]
            public string Type { get; set; }
    
            [JsonProperty("name")]
            public string Name { get; set; }
    
            [JsonProperty("comment")]
            public string Comment { get; set; }
    
            [JsonProperty("name_without_price")]
            public string NameWithoutPrice { get; set; }
    
            [JsonProperty("self_delivery_in_name")]
            public bool SelfDeliveryInName { get; set; }
    
            [JsonProperty("min_price")]
            public object MinPrice { get; set; }
    
            [JsonProperty("delivery_price")]
            public object DeliveryPrice { get; set; }
    
            [JsonProperty("free_delivery_price")]
            public long? FreeDeliveryPrice { get; set; }
    
            [JsonProperty("free_delivery")]
            public bool FreeDelivery { get; set; }
    
            [JsonProperty("apply_to_total")]
            public bool ApplyToTotal { get; set; }
    
            [JsonProperty("position")]
            public long? Position { get; set; }
    
            [JsonProperty("hidden")]
            public bool Hidden { get; set; }
    
            [JsonProperty("from_last_name", NullValueHandling = NullValueHandling.Ignore)]
            public bool? FromLastName { get; set; }
    
            [JsonProperty("address_street", NullValueHandling = NullValueHandling.Ignore)]
            public bool? AddressStreet { get; set; }
    
            [JsonProperty("prices")]
            public Prices Prices { get; set; }
    
            [JsonProperty("id")]
            public long Id { get; set; }
        }
    
        public  class Prices
        {
            [JsonProperty("priceTotals")]
            public object[] PriceTotals { get; set; }
    
            [JsonProperty("rawTotalPriceInDefaultCurrency")]
            public long RawTotalPriceInDefaultCurrency { get; set; }
    
            [JsonProperty("priceTotalsDefaultCurrency")]
            public long PriceTotalsDefaultCurrency { get; set; }
    
            [JsonProperty("priceTotalsInUSD")]
            public long PriceTotalsInUsd { get; set; }
    
            [JsonProperty("priceTotalsHtml")]
            public long PriceTotalsHtml { get; set; }
        }
    
        public  class Phones
        {
            [JsonProperty("type")]
            public string Type { get; set; }
    
            [JsonProperty("json")]
            public JsonUnion[] Json { get; set; }
        }
    
        public  class JsonJson
        {
            [JsonProperty("description")]
            public string Description { get; set; }
    
            [JsonProperty("number")]
            public string Number { get; set; }
        }
    
        public  struct JsonUnion
        {
            public JsonJson JsonJson;
            public string String;
    
            public static implicit operator JsonUnion(JsonJson JsonJson) => new JsonUnion { JsonJson = JsonJson };
            public static implicit operator JsonUnion(string String) => new JsonUnion { String = String };
        }
    

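    Extensions

        static class Ext
        {
            public static string SubstringJson(this string value, string start, string end)
            {
                // Locate the start marker and skip past it.
                var startIndex = value.IndexOf(start, StringComparison.Ordinal);
                if (startIndex < 0) throw new FormatException($"Marker '{start}' not found.");
                startIndex += start.Length;
    
                // Search for the end marker only after the start marker.
                var endIndex = value.IndexOf(end, startIndex, StringComparison.Ordinal);
                if (endIndex < 0) throw new FormatException($"Marker '{end}' not found.");
    
                // Drop surrounding whitespace and the trailing ';' of the JavaScript assignment.
                return value.Substring(startIndex, endIndex - startIndex).Trim().TrimEnd(';');
            }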
    
            public static int SubstringId(this string url)
            {
                var result = new Uri(url).AbsolutePath.Split('-').FirstOrDefault()?.Remove(0, 2);
                return int.Parse(result);
            }
        }
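
`SubstringId` relies on the URL having the shape `/p<digits>-<slug>.html`, as in the example links in this thread, and throws on anything else. Below is a small sketch of a validating alternative under that same assumption. `ProductUrl` and `TryGetProductId` are hypothetical names, and `long` is used only to leave headroom for larger IDs.

```csharp
using System;
using System.Text.RegularExpressions;

static class ProductUrl
{
    // Assumed path shape, taken from the example URLs: /p<digits>-<slug>.html
    private static readonly Regex IdPattern =
        new Regex(@"^/p([0-9]+)(?:-|\.|$)", RegexOptions.CultureInvariant);

    // Hypothetical helper: extracts the numeric product ID from a product URL.
    public static bool TryGetProductId(string url, out long id)
    {
        id = 0;
        Uri uri;
        if (!Uri.TryCreate(url, UriKind.Absolute, out uri)) return false;

        Match match = IdPattern.Match(uri.AbsolutePath);
        return match.Success && long.TryParse(match.Groups[1].Value, out id);
    }
}
```

A caller can check the return value before building the `Product:{id}` key, so URLs that are not product pages are skipped rather than causing a parse exception.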
    

    The GetJsonAsync method

            static readonly HttpClient client = new HttpClient();
            private static async Task<JObject> GetJsonAsync(string url)
            {
                var html = await client.GetStringAsync(url);
                var jsonString = html.SubstringJson("window.ApolloCacheState =", "window.SPAConfig");
                return JObject.Parse(jsonString);
            }
    

    The ParseItem method

            private async static Task ParseItem(string itemUrl)
            {
                var id = itemUrl.SubstringId();
    
                // The page embeds an Apollo cache; the product entry is keyed by "Product:<id>".
                var json = await GetJsonAsync(itemUrl);
                var productToken = json[$"Product:{id}"];
                var product = JsonConvert.DeserializeObject<Product>(productToken.ToString());
    
                // The product entry refers to its company by the company's cache key.
                var companyId = productToken["company"]["id"];
                var companyToken = json[$"{companyId}"];
                var company = JsonConvert.DeserializeObject<Company>(companyToken.ToString());
    
                var builder = new StringBuilder();
                builder.AppendLine($"{product.Name}");
    
                builder.AppendLine("Images:");
                foreach (var image in product.Images.Json)
                {
                    builder.AppendLine($"{image}");
                }
                builder.AppendLine($"Price: {product.Price} {product.PriceCurrencyLocalized} ({product.PriceUsd} USD)");
                builder.AppendLine($"Company: {company.Name} (city: {company.City})");
                builder.AppendLine($"Contact: {company.ContactPerson} {company.Phone} {company.WebSite}");
    
                Console.WriteLine(builder);
            }
    

    Main

            static async Task Main()
            {
    
                await ParseItem("https://prom.ua/p1097470680-halat-zhenskij-dlinnyj.html");
                Console.ReadKey();
            }
    

    Implementation code
