{"id":9469,"date":"2023-05-22T15:25:46","date_gmt":"2023-05-22T08:25:46","guid":{"rendered":"https:\/\/bap-software.net\/?post_type=products&#038;p=9469"},"modified":"2024-01-26T15:36:24","modified_gmt":"2024-01-26T08:36:24","slug":"image-captioning","status":"publish","type":"products","link":"https:\/\/bap-software.net\/vi\/case-study\/image-captioning\/","title":{"rendered":"Image Captioning"},"content":{"rendered":"<p><\/p>\n<h2><strong>Problem: Image Captioning<\/strong><\/h2>\n<p>Given an image, our goal is to generate a caption.<\/p>\n<ul>\n<li>Input: Image<\/li>\n<li>Output: Caption for image<\/li>\n<\/ul>\n<h2><strong>Solution<\/strong><\/h2>\n<p>For this problem, we will use InceptionV3 (which is pre-trained on ImageNet) to classify each image. We will extract features from the last convolutional layer. An RNN decoder (here, a GRU) then attends over the image features to predict the next word of the caption.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"760\" height=\"368\" class=\"size-full wp-image-9471 aligncenter\" src=\"https:\/\/cdn.bap-software.net\/2023\/05\/Case-10-Solution.png\" alt=\"Case 10 - Solution\" \/><\/p>\n<h2><strong>Experimental Results<\/strong><\/h2>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"809\" height=\"339\" class=\"size-full wp-image-9470 aligncenter\" src=\"https:\/\/cdn.bap-software.net\/2023\/05\/Case-10-Experimental-Results.png\" alt=\"Case 10 - Experimental Results\" \/><\/p>","protected":false},"featured_media":13409,"template":"","product":[65],"cs_tag":[10189,10190],"class_list":["post-9469","products","type-products","status-publish","has-post-thumbnail","hentry","product-ai","cs_tag-ai","cs_tag-ai-technology"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v26.1 (Yoast SEO v27.3) - https:\/\/yoast.com\/product\/yoast-seo-premium-wordpress\/ -->\n<title>(English) Image Captioning - \u682a\u5f0f\u4f1a\u793eBAP Solution Japan<\/title>\n<meta name=\"description\" content=\"(English) For this 
problem, we will use InceptionV3 (which is pre-trained on ImageNet) to classify each image. We will extract features from the last...\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/bap-software.net\/vi\/case-study\/image-captioning\/\" \/>\n<meta property=\"og:locale\" content=\"vi_VN\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Image Captioning\" \/>\n<meta property=\"og:description\" content=\"(English) For this problem, we will use InceptionV3 (which is pre-trained on ImageNet) to classify each image. We will extract features from the last...\" \/>\n<meta property=\"og:url\" content=\"https:\/\/bap-software.net\/vi\/case-study\/image-captioning\/\" \/>\n<meta property=\"og:site_name\" content=\"C\u00f4ng Ty C\u1ed5 Ph\u1ea7n \u0110\u1ea7u T\u01b0 V\u00e0 C\u00f4ng Ngh\u1ec7 BAP\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/bap32\" \/>\n<meta property=\"article:modified_time\" content=\"2024-01-26T08:36:24+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/cdn.bap-software.net\/2023\/05\/26223500\/image-captioning-min-e1706258171645.webp\" \/>\n\t<meta property=\"og:image:width\" content=\"883\" \/>\n\t<meta property=\"og:image:height\" content=\"760\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/webp\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:site\" content=\"@bapsoftware\" \/>\n<meta name=\"twitter:label1\" content=\"\u01af\u1edbc t\u00ednh th\u1eddi gian \u0111\u1ecdc\" \/>\n\t<meta name=\"twitter:data1\" content=\"1 ph\u00fat\" \/>\n<script type=\"application\/ld+json\" 
class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/bap-software.net\\\/vi\\\/case-study\\\/image-captioning\\\/\",\"url\":\"https:\\\/\\\/bap-software.net\\\/vi\\\/case-study\\\/image-captioning\\\/\",\"name\":\"(English) Image Captioning - \u682a\u5f0f\u4f1a\u793eBAP Solution Japan\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/bap-software.net\\\/vi\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/bap-software.net\\\/vi\\\/case-study\\\/image-captioning\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/bap-software.net\\\/vi\\\/case-study\\\/image-captioning\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/cdn.bap-software.net\\\/2023\\\/05\\\/26223500\\\/image-captioning-min-e1706258171645.webp\",\"datePublished\":\"2023-05-22T08:25:46+00:00\",\"dateModified\":\"2024-01-26T08:36:24+00:00\",\"description\":\"(English) For this problem, we will use InceptionV3 (which is pre-trained on ImageNet) to classify each image. 
We will extract features from the last...\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/bap-software.net\\\/vi\\\/case-study\\\/image-captioning\\\/#breadcrumb\"},\"inLanguage\":\"vi\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[[\"https:\\\/\\\/bap-software.net\\\/vi\\\/case-study\\\/image-captioning\\\/\"]]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"vi\",\"@id\":\"https:\\\/\\\/bap-software.net\\\/vi\\\/case-study\\\/image-captioning\\\/#primaryimage\",\"url\":\"https:\\\/\\\/cdn.bap-software.net\\\/2023\\\/05\\\/26223500\\\/image-captioning-min-e1706258171645.webp\",\"contentUrl\":\"https:\\\/\\\/cdn.bap-software.net\\\/2023\\\/05\\\/26223500\\\/image-captioning-min-e1706258171645.webp\",\"width\":883,\"height\":760,\"caption\":\"Image Captioning\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/bap-software.net\\\/vi\\\/case-study\\\/image-captioning\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/bap-software.net\\\/vi\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Case Study\",\"item\":\"https:\\\/\\\/bap-software.net\\\/vi\\\/case-study\\\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Image Captioning\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/bap-software.net\\\/vi\\\/#website\",\"url\":\"https:\\\/\\\/bap-software.net\\\/vi\\\/\",\"name\":\"C\u00f4ng Ty C\u1ed5 Ph\u1ea7n \u0110\u1ea7u T\u01b0 V\u00e0 C\u00f4ng Ngh\u1ec7 BAP\",\"description\":\"\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/bap-software.net\\\/vi\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"vi\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. 
-->","yoast_head_json":{"title":"(English) Image Captioning - \u682a\u5f0f\u4f1a\u793eBAP Solution Japan","description":"(English) For this problem, we will use InceptionV3 (which is pre-trained on ImageNet) to classify each image. We will extract features from the last...","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/bap-software.net\/vi\/case-study\/image-captioning\/","og_locale":"vi_VN","og_type":"article","og_title":"Image Captioning","og_description":"(English) For this problem, we will use InceptionV3 (which is pre-trained on ImageNet) to classify each image. We will extract features from the last...","og_url":"https:\/\/bap-software.net\/vi\/case-study\/image-captioning\/","og_site_name":"C\u00f4ng Ty C\u1ed5 Ph\u1ea7n \u0110\u1ea7u T\u01b0 V\u00e0 C\u00f4ng Ngh\u1ec7 BAP","article_publisher":"https:\/\/www.facebook.com\/bap32","article_modified_time":"2024-01-26T08:36:24+00:00","og_image":[{"width":883,"height":760,"url":"https:\/\/cdn.bap-software.net\/2023\/05\/26223500\/image-captioning-min-e1706258171645.webp","type":"image\/webp"}],"twitter_card":"summary_large_image","twitter_site":"@bapsoftware","twitter_misc":{"\u01af\u1edbc t\u00ednh th\u1eddi gian \u0111\u1ecdc":"1 ph\u00fat"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/bap-software.net\/vi\/case-study\/image-captioning\/","url":"https:\/\/bap-software.net\/vi\/case-study\/image-captioning\/","name":"(English) Image Captioning - \u682a\u5f0f\u4f1a\u793eBAP Solution 
Japan","isPartOf":{"@id":"https:\/\/bap-software.net\/vi\/#website"},"primaryImageOfPage":{"@id":"https:\/\/bap-software.net\/vi\/case-study\/image-captioning\/#primaryimage"},"image":{"@id":"https:\/\/bap-software.net\/vi\/case-study\/image-captioning\/#primaryimage"},"thumbnailUrl":"https:\/\/cdn.bap-software.net\/2023\/05\/26223500\/image-captioning-min-e1706258171645.webp","datePublished":"2023-05-22T08:25:46+00:00","dateModified":"2024-01-26T08:36:24+00:00","description":"(English) For this problem, we will use InceptionV3 (which is pre-trained on ImageNet) to classify each image. We will extract features from the last...","breadcrumb":{"@id":"https:\/\/bap-software.net\/vi\/case-study\/image-captioning\/#breadcrumb"},"inLanguage":"vi","potentialAction":[{"@type":"ReadAction","target":[["https:\/\/bap-software.net\/vi\/case-study\/image-captioning\/"]]}]},{"@type":"ImageObject","inLanguage":"vi","@id":"https:\/\/bap-software.net\/vi\/case-study\/image-captioning\/#primaryimage","url":"https:\/\/cdn.bap-software.net\/2023\/05\/26223500\/image-captioning-min-e1706258171645.webp","contentUrl":"https:\/\/cdn.bap-software.net\/2023\/05\/26223500\/image-captioning-min-e1706258171645.webp","width":883,"height":760,"caption":"Image Captioning"},{"@type":"BreadcrumbList","@id":"https:\/\/bap-software.net\/vi\/case-study\/image-captioning\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/bap-software.net\/vi\/"},{"@type":"ListItem","position":2,"name":"Case Study","item":"https:\/\/bap-software.net\/vi\/case-study\/"},{"@type":"ListItem","position":3,"name":"Image Captioning"}]},{"@type":"WebSite","@id":"https:\/\/bap-software.net\/vi\/#website","url":"https:\/\/bap-software.net\/vi\/","name":"C\u00f4ng Ty C\u1ed5 Ph\u1ea7n \u0110\u1ea7u T\u01b0 V\u00e0 C\u00f4ng Ngh\u1ec7 
BAP","description":"","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/bap-software.net\/vi\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"vi"}]}},"_links":{"self":[{"href":"https:\/\/bap-software.net\/vi\/wp-json\/wp\/v2\/products\/9469","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/bap-software.net\/vi\/wp-json\/wp\/v2\/products"}],"about":[{"href":"https:\/\/bap-software.net\/vi\/wp-json\/wp\/v2\/types\/products"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/bap-software.net\/vi\/wp-json\/wp\/v2\/media\/13409"}],"wp:attachment":[{"href":"https:\/\/bap-software.net\/vi\/wp-json\/wp\/v2\/media?parent=9469"}],"wp:term":[{"taxonomy":"product","embeddable":true,"href":"https:\/\/bap-software.net\/vi\/wp-json\/wp\/v2\/product?post=9469"},{"taxonomy":"cs_tag","embeddable":true,"href":"https:\/\/bap-software.net\/vi\/wp-json\/wp\/v2\/cs_tag?post=9469"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}